word-boundary

How can I make a regular expression which takes accented characters into account?

狂风中的少年 submitted on 2019-11-29 10:40:40
I have a JavaScript regular expression which basically finds two-letter words. The problem seems to be that it interprets accented characters as word boundaries. Indeed, according to "AS3 RegExp to match words with boundry type characters in them":

A word boundary ("\b") is a spot between two characters that has a "\w" on one side of it and a "\W" on the other side of it (in either order), counting the imaginary characters off the beginning and end of the string as matching a "\W".

And since \w matches any alphanumerical character (word characters) including underscore (short for [a-zA-Z0-9_]).
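The behaviour described above can be reproduced with a short sketch. The sample sentence and the variable names are made up for illustration, and the Unicode-aware pattern is only one possible workaround, assuming an engine that supports the `u` flag, `\p{L}` property escapes and lookbehind (ES2018 or later).

```javascript
// "é" is written as \u00e9 so the string is certain to hold a single
// precomposed code point rather than "e" plus a combining accent.
const sentence = "il va cr\u00e9er";

// ASCII-only \w: "é" counts as a non-word character, so "créer" is split
// into the bogus two-letter "words" "cr" and "er".
const asciiMatches = sentence.match(/\b\w{2}\b/g);

// Possible alternative: treat any Unicode letter as a word character and
// emulate \b with lookarounds on \p{L}.
const unicodeMatches = sentence.match(/(?<!\p{L})\p{L}{2}(?!\p{L})/gu);

console.log(asciiMatches);   // [ 'il', 'va', 'cr', 'er' ]
console.log(unicodeMatches); // [ 'il', 'va' ]
```

The lookaround form keeps the match zero-width at both ends, so, as with \b, the neighbouring characters are not consumed.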


MySQL REGEXP word boundaries [[:<:]] [[:>:]] and double quotes

爷,独闯天下 submitted on 2019-11-28 01:06:51
I'm trying to match some whole-word expressions with the MySQL REGEXP function. There is a problem when double quotes are involved. The MySQL documentation says: "To use a literal instance of a special character in a regular expression, precede it by two backslash (\) characters." But these queries all return 0:

SELECT '"word"' REGEXP '[[:<:]]"word"[[:>:]]'; -> 0
SELECT '"word"' REGEXP '[[:<:]]\"word\"[[:>:]]'; -> 0
SELECT '"word"' REGEXP '[[:<:]]\\"word\\"[[:>:]]'; -> 0
SELECT '"word"' REGEXP '[[:<:]] word [[:>:]]'; -> 0
SELECT '"word"' REGEXP '[[:<:]][[.".]]word[[.".]][[:>:]]'; -> 0
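A likely reason the queries above fail can be illustrated by analogy in JavaScript. This is only a sketch: JavaScript's \b is not the same construct as MySQL's [[:<:]] and [[:>:]], and the variable names are hypothetical. The point it shows is that a double quote is not a word character, so there is no word boundary between the start of the string and the quote.

```javascript
const quoted = '"word"';

// Boundary placed outside the quotes: at position 0 both neighbours (the
// start of the string and the quote) are non-word, so \b cannot match there.
const boundaryOutsideQuotes = /\b"word"\b/.test(quoted);

// Boundary placed inside the quotes: it now sits between a quote (non-word)
// and a letter (word), which is a real word boundary.
const boundaryInsideQuotes = /"\bword\b"/.test(quoted);

console.log(boundaryOutsideQuotes); // false
console.log(boundaryInsideQuotes);  // true
```

By the same reasoning one would expect the MySQL markers to belong next to the letters rather than next to the quotes, but that is an assumption that has not been checked against MySQL here.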

Javascript - regex - word boundary (\b) issue

有些话、适合烂在心里 submitted on 2019-11-27 22:53:58
I have difficulty using \b with Greek characters in a regex. In this example, [a-zA-ZΆΈ-ώἀ-ῼ]* succeeds in marking all the words I want (both Greek and English). Now consider that I want to find words with 2 letters. For English I use something like this: \b[a-zA-Z]{2}\b . Can you help me write a regex that marks two-letter words in Greek? (Why? My final goal is to remove them.)

Text used: Greek MONOTONIC: Το γάρ ούν και παρ' υμίν λεγόμενον, ώς ποτε Φαέθων Ηλίου παίς το του πατρός άρμα ζεύξας δια το μή δυνατός είναι κατά την του πατρός οδόν ελαύνειν τα τ' επί της γής
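One way to approach the removal goal stated above is sketched here. The function name `removeTwoLetterWords` and the whitespace clean-up are illustrative assumptions, not part of the original question, and the pattern assumes an engine with the `u` flag, `\p{L}` and lookbehind support. The input is normalized to NFC first so that an accented letter is a single code point.

```javascript
// Matches exactly two letters (any script) that are not adjacent to
// another letter on either side.
const TWO_LETTER_WORD = /(?<!\p{L})\p{L}{2}(?!\p{L})/gu;

// Hypothetical helper: drops two-letter words, then tidies the spacing.
function removeTwoLetterWords(input) {
  return input
    .normalize("NFC")              // compose letter + combining accent pairs
    .replace(TWO_LETTER_WORD, "")  // delete the two-letter words
    .replace(/\s{2,}/g, " ")       // collapse the gaps left behind
    .trim();
}

console.log(removeTwoLetterWords("Το γάρ ούν και παρ' υμίν"));
console.log(removeTwoLetterWords("it is a test of me"));
```

Because the pattern relies on the general letter class rather than explicit ranges such as Ά-ώ, the same expression covers both the Greek and the English words.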

PostgreSQL Regex Word Boundaries?

守給你的承諾、 submitted on 2019-11-27 18:16:45
Does PostgreSQL support \b ? I'm trying \bAB\b but it doesn't match anything, whereas (\W|^)AB(\W|$) does. These 2 expressions are essentially the same, aren't they?

Daniel Vandersluis: PostgreSQL uses \m , \M , \y and \Y as word boundaries:

\m matches only at the beginning of a word
\M matches only at the end of a word
\y matches only at the beginning or end of a word
\Y matches only at a point that is not the beginning or end of a word

See Regular Expression Constraint Escapes in the manual. There is also [[:<:]] and [[:>:]] , which match the beginning and end of a word. From the manual :


What are non-word boundary in regex (\B), compared to word-boundary?

依然范特西╮ submitted on 2019-11-27 03:45:59
What are non-word boundary in regex (\B), compared to word-boundary?

A word boundary ( \b ) is a zero-width match that can match:

Between a word character ( \w ) and a non-word character ( \W ), or
Between a word character and the start or end of the string.

In JavaScript the definition of \w is [A-Za-z0-9_] and \W is anything else.

The negated version of \b , written \B , is a zero-width match where the above does not hold. Therefore it can match:

Between two word characters.
Between two non-word characters.
Between a non-word character and the start or end of the string.
The empty string.

For
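The two lists above can be checked with a small sketch that marks every position where each assertion matches. The sample string and variable names are arbitrary choices for illustration.

```javascript
const sample = "ab, cd";

// Insert "|" at every position where \b matches (edges of "ab" and "cd").
const boundaries = sample.replace(/\b/g, "|");

// Insert "|" at every position where \B matches: between two word
// characters (a|b, c|d) and between two non-word characters (","|" ").
const nonBoundaries = sample.replace(/\B/g, "|");

// The empty string has no word character next to position 0, so only \B
// matches there.
const emptyMatchesB = /\B/.test("");
const emptyMatchesSmallB = /\b/.test("");

console.log(boundaries);         // "|ab|, |cd|"
console.log(nonBoundaries);      // "a|b,| c|d"
console.log(emptyMatchesB);      // true
console.log(emptyMatchesSmallB); // false
```

Every position in the string is marked by exactly one of the two assertions, which is what "negated version" means here.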

Oracle REGEXP_LIKE and word boundaries

我与影子孤独终老i submitted on 2019-11-27 03:43:10
I am having a problem with matching word boundaries with REGEXP_LIKE. The following query returns a single row, as expected.

select 1 from dual where regexp_like('DOES TEST WORK HERE','TEST');

But I want to match on word boundaries as well. So, adding the "\b" characters gives this query:

select 1 from dual where regexp_like('DOES TEST WORK HERE','\bTEST\b');

Running this returns zero rows. Any ideas?

I believe you want to try

select 1 from dual where regexp_like ('does test work here', '(^|\s)test(\s|$)');

because the \b does not appear on this list: http://download.oracle.com/docs/cd/B19306

detecting word boundary with regex in data frame in R

梦想的初衷 submitted on 2019-11-26 21:52:51
Question: I have a data.frame named all that has a column of factors; these factors include "word" , "nonword" and some others. My goal is to select only the rows that have the factor value "word". My solution grep("\bword\b",all[,5]) returns nothing. How come word boundaries are not recognized?

Answer 1: In R, you need to write the backslash twice: grep("\\bword\\b", all[, 5])

Alternative solutions: grep("^word$", all[, 5]) which(all[, 5] == "word")

Source: https://stackoverflow.com/questions/17906003/detecting-word-boundary-with

utf-8 word boundary regex in javascript

只愿长相守 submitted on 2019-11-26 12:28:47
In JavaScript:

"ab abc cab ab ab".replace(/\bab\b/g, "AB");

correctly gives me:

"AB abc cab AB AB"

When I use utf-8 characters though:

"αβ αβγ γαβ αβ αβ".replace(/\bαβ\b/g, "AB");

the word boundary operator doesn't seem to work:

"αβ αβγ γαβ αβ αβ"

Is there a solution to this?

The word boundary assertion only matches if a word character is not preceded or followed by another word character (so .\b. is equal to \W\w and \w\W ). And \w is defined as [A-Za-z0-9_] . So \w doesn't match Greek characters, and thus you cannot use \b for this case. What you could do instead is to use this: "αβ αβγ
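The answer's own replacement is cut off above, so the following is a separate sketch, not a reconstruction of it. It shows one possible way to replace a whole word without \b and without lookbehind: capture the character before the word and put it back. The helper name `replaceWholeWord` is hypothetical, and the code assumes an engine that supports the `u` flag and `\p{...}` property escapes.

```javascript
// Hypothetical helper: replaces `word` only where it is not glued to
// another letter, digit or underscore on either side.
function replaceWholeWord(input, word, replacement) {
  // Escape regex metacharacters so the word is matched literally.
  const escaped = word.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  // Group 1 captures the start of the string or one non-word character;
  // the lookahead checks the right-hand side without consuming it.
  const pattern = new RegExp(
    `(^|[^\\p{L}\\p{N}_])${escaped}(?![\\p{L}\\p{N}_])`,
    "gu"
  );
  // Re-emit the captured prefix so only the word itself is replaced.
  return input.replace(pattern, (match, before) => before + replacement);
}

console.log(replaceWholeWord("αβ αβγ γαβ αβ αβ", "αβ", "AB"));
// "AB αβγ γαβ AB AB"
```

The right-hand side uses a lookahead rather than a consumed character so that two occurrences separated by a single space are both replaced.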