word-boundary | 易学教程

How to match the first word after an expression with regex?

Question: For example, in this text: Lorem ipsum dolor sit amet, consectetur adipiscing elit. Nunc eu tellus vel nunc pretium lacinia. Proin sed lorem. Cras sed ipsum. Nunc a libero quis risus sollicitudin imperdiet. I want to match the word after 'ipsum'.

Answer 1: This sounds like a job for lookbehinds, though you should be aware that not all regex flavors support them. In your example: (?<=\bipsum\s)(\w+) This will match any sequence of word characters (letters, digits, or underscores) which follows "ipsum" as a whole word followed by
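As a rough illustration of the lookbehind described in the answer, here is a minimal Python sketch. The sample text is a shortened version of the one in the question, and the variable names are illustrative only. Python's `re` module accepts this lookbehind because it has a fixed width.

```python
import re

# Shortened sample text, adapted from the question.
text = ("Lorem ipsum dolor sit amet, consectetur adipiscing elit. "
        "Proin sed lorem. Cras sed ipsum. Nunc a libero quis risus.")

# (?<=\bipsum\s) asserts that the whole word "ipsum" plus exactly one
# whitespace character sits immediately before the match position;
# (\w+) then captures the word that follows.
pattern = re.compile(r"(?<=\bipsum\s)(\w+)")

words_after = pattern.findall(text)
print(words_after)  # ['dolor']
```

Note that the second "ipsum" is followed by a period rather than whitespace, so the lookbehind does not fire before "Nunc".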

What is a word boundary in regexes?

Question: I am using Java regexes in Java 1.6 (inter alia to parse numeric output) and cannot find a precise definition of \b ("word boundary"). I had assumed that -12 would be an "integer word" (matched by \b\-?\d+\b) but it appears that this does not work. I'd be grateful to know of ways of matching space-separated numbers. Example: Pattern pattern = Pattern.compile("\\s*\\b\\-?\\d+\\s*"); String plus = " 12 "; System.out.println(""+pattern.matcher(plus).matches()); String minus = " -12 "; System
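The behaviour in the question can be reproduced outside Java. The sketch below uses Python's `re` module, whose `\b` follows the same basic rule for ASCII input: a boundary exists only where a word character meets a non-word character. The whitespace-lookaround alternative is an assumption offered for illustration, not something taken from the thread.

```python
import re

# \b needs a word character on exactly one side. Between " " and "-"
# both neighbours are non-word characters, so there is no boundary.
with_boundary = re.compile(r"\s*\b-?\d+\s*")
plus_ok = bool(with_boundary.fullmatch(" 12 "))    # True
minus_ok = bool(with_boundary.fullmatch(" -12 "))  # False

# Assumed alternative: require whitespace (or the string edge) on both
# sides instead of relying on \b.
space_separated = re.compile(r"(?<!\S)-?\d+(?!\S)")
numbers = space_separated.findall(" 12  -12 x-3 4.5 ")
print(plus_ok, minus_ok, numbers)  # True False ['12', '-12']
```

`fullmatch` plays the role of Java's `Matcher.matches()` here, since both require the pattern to cover the entire input.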

JavaScript regular expression for word boundaries, tolerating in-word hyphens and apostrophes

Question: I'm looking for a Regular Expression for JavaScript that will identify word boundaries in English, while accepting hyphens and apostrophes that appear inside words, but excluding those that appear alone or at the beginning or end of a word. For example, for the sentence ... She said - 'That'll be all, Two-Fry.' ... I want the characters shown in grey below to be detected: She said - ' That'll be all , Two-Fry .' If I use the regex /[^A-Za-z'-]/g, then "loose" hyphens and apostrophes are not

mysql: instr specify word boundaries

Question: I want to check whether a string contains a field value as a substring. select * from mytable where instr("mystring", column_name); but this does not search on word boundaries. select * from mytable where instr("mystring", concat('[[:<:]]',column_name,'[[:>:]]'); does not work either. How can I correct this?

Answer 1: You can do this using the REGEXP operator: SELECT * FROM mytable WHERE 'mystring' REGEXP CONCAT('[[:<:]]', column_name, '[[:>:]]'); Note, however, that this is slow. You might be

Ellipsis After Certain Number or Characters with Word Boundaries

Question: I'm trying to add an ellipsis (…) to shorten long descriptions, and I want to cut at word boundaries. Here's my current code eval.in: # Assume $body is a long text. $line = $body; if(strlen($body) > 300 && preg_match('/^.{1,300}\b/su', $body, $match)) { $line = trim($match[0]) . "…"; } echo $line; This actually works pretty well and I like it, except that there are times when the word boundary has punctuation after it. If I use the code above, I get results like the following: This is a long
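One possible way to avoid cutting right before or after punctuation is to require that the cut falls at the end of a word. The Python sketch below is only an illustration of that idea, not the answer from the thread (which is not shown in the excerpt); the function name `shorten` and its `limit` parameter are hypothetical.

```python
import re

def shorten(body: str, limit: int = 300) -> str:
    """Cut `body` at the last word end within `limit` characters."""
    if len(body) <= limit:
        return body
    # Greedily take up to `limit` characters, then backtrack until the
    # cut sits just after a word character and just before a non-word
    # character, i.e. at the end of a word rather than after punctuation.
    match = re.match(r".{1,%d}(?<=\w)(?!\w)" % limit, body, flags=re.DOTALL)
    if match is None:
        # No word end in range: fall back to a hard cut.
        return body[:limit] + "…"
    return match.group(0) + "…"

sample = "This is a long sentence, with punctuation inside it."
print(shorten(sample, 24))  # This is a long sentence…
print(shorten(sample, 20))  # This is a long…
```

Unlike `\b`, which also matches at the start of a word, the `(?<=\w)(?!\w)` pair matches only at word ends, so trailing commas and spaces are never included in the shortened text.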

Word boundary regex issue

Question: I'm having issues using word boundaries \b in my regular expression. I'm using R, but the issue also occurs when I try http://regexr.com. The pattern I'm using is \bs\.l\.\b, and while I expected lines 1 and 3 below to match this pattern, only line 2 matches: aaa s.l. bbb aaa s.l.bbb aaa s.l., bbb See http://regexr.com/3f154 as well.

Answer 1: The word boundaries match in the following positions: Before the first character in the string, if the first character is a word character. After the
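The three lines from the question can be checked directly. The following Python sketch reproduces the reported result and then tries a negative lookahead in place of the trailing `\b`; that replacement is an assumption for illustration and is not taken from the truncated answer.

```python
import re

lines = ["aaa s.l. bbb", "aaa s.l.bbb", "aaa s.l., bbb"]

# The trailing \b sits after ".", a non-word character, so it can only
# match when the NEXT character is a word character (as in line 2).
strict = re.compile(r"\bs\.l\.\b")
strict_hits = [bool(strict.search(line)) for line in lines]

# Assumed fix: say what is actually meant, namely that no word
# character may follow the final dot.
relaxed = re.compile(r"\bs\.l\.(?!\w)")
relaxed_hits = [bool(relaxed.search(line)) for line in lines]

print(strict_hits)   # [False, True, False]
print(relaxed_hits)  # [True, False, True]
```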

Regular Expression to match word repeated twice (i.e. hello hello hello)

Question: I have a Java regular expression, given by my CS2 instructor, that checks if a word is repeated: \\b(\\w+)\\s+\\1\\b How can I modify this to check if a word is repeated twice, as in "hello hello hello" or "hello world hello hello"? If possible, I'd just like to be pointed in the right direction, not given an outright solution (after all, I need to learn this). I think my problem is that I don't understand word boundaries well.

Answer 1: Well, since you seem to want to learn this yourself I'll give you a

AS3 RegExp to match words with boundry type characters in them

Question: I want to match a list of words, which is easy enough when those words are truly words. For example, /\b (pop|push) \b/gsx when run against the string pop gave the door a push but it popped back will match the words pop and push but not popped. I need similar functionality for words that contain characters that would normally qualify as word boundaries. So I need /\b (reverse!|push) \b/gsx when run against the string push reverse! reverse!push to only match reverse! and push but not match
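When the "words" themselves end in non-word characters such as `!`, `\b` no longer marks their edges. One common workaround is to delimit on whitespace instead. The Python sketch below shows that idea; it assumes words are separated by whitespace, and, because the excerpt cuts off before naming the string that should not match, it further assumes that the run-together `reverse!push` is the case to reject.

```python
import re

# Hypothetical word list taken from the question's example.
words = ["reverse!", "push"]

# Escape each entry so characters like "!" or "." are matched literally,
# then require whitespace or a string edge on both sides.
alternation = "|".join(re.escape(word) for word in words)
pattern = re.compile(r"(?<!\S)(?:%s)(?!\S)" % alternation)

found = pattern.findall("push reverse! reverse!push")
print(found)  # ['push', 'reverse!']
```

If the input can contain punctuation directly after a word (for example a trailing comma), the lookarounds would need to be loosened accordingly.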

How can I find repeated words in a file using grep/egrep?

Question: I need to find repeated words in a file using egrep (or grep -e) in Unix (bash). I tried: egrep "(\<[a-zA-Z]+\>) \1" file.txt and egrep "(\b[a-zA-Z]+\b) \1" file.txt but for some reason these consider things to be repeats that aren't! For example, it thinks the string "word words" meets the criteria despite the word boundary condition \> or \b.

Answer 1: \1 matches whatever string was matched by the first capture. That is not the same as matching the same pattern as was matched by the first
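The point made in the answer, that a backreference repeats the captured text and not the pattern (including its boundary assertions), can be seen in a short Python sketch. The second pattern, which moves the closing boundary after `\1`, is an assumed correction for illustration; the excerpt does not show the fix the answer goes on to propose.

```python
import re

# Boundaries inside the group apply only to the first occurrence.
# \1 then matches the literal text "word", even as a prefix of "words".
loose = re.compile(r"(\b[a-zA-Z]+\b) \1")

# Assumed correction: place a boundary after the backreference as well.
tight = re.compile(r"\b([a-zA-Z]+) \1\b")

loose_false_positive = bool(loose.search("word words"))
tight_rejects = bool(tight.search("word words"))
tight_accepts = bool(tight.search("the the cat"))

print(loose_false_positive, tight_rejects, tight_accepts)  # True False True
```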