regex-lookarounds

Regex to match whatsapp chat log

扶醉桌前 submitted on 2019-12-06 07:20:58
I've been trying to create a regex for a WhatsApp chat log. So far I have the following regex (a test link was provided in the original post): (?P<datetime>\d{2}\/\d{2}\/\d{4},\s\d(?:\d)?:\d{2} [pa].m.)\s-\s(?P<name>[^:]*):(?P<message>.*) The problem with this regex is that it cannot match long messages that span multiple lines with line breaks. You can see the issue in the linked test. Help would be appreciated. Thank you.

There you go: ^ (?P<datetime>\d{2}/\d{2}/\d{4}[^-]+)\s+-\s+ (?P<name>[^:]+):\s+ (?P<message>[\s\S]+?) (?=^\d{2}|\Z) See your modified demo on
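The answer's pattern contains literal spaces, so it presumably relies on verbose (free-spacing) and multiline modes. Under that assumption, a minimal Python sketch of how it could be applied is shown below. The sample log and the helper name `parse_chat` are made up for illustration.

```python
import re

# The answer's pattern, compiled with VERBOSE (whitespace in the pattern is
# ignored) and MULTILINE (^ matches at the start of every line).
CHAT_ENTRY = re.compile(
    r"""
    ^(?P<datetime>\d{2}/\d{2}/\d{4}[^-]+)\s+-\s+
    (?P<name>[^:]+):\s+
    (?P<message>[\s\S]+?)      # lazy: may cross line breaks
    (?=^\d{2}|\Z)              # stop before the next entry or at the end
    """,
    re.VERBOSE | re.MULTILINE,
)

# Hypothetical sample data, not taken from the original post.
sample_log = (
    "06/12/2019, 7:20 p.m. - Alice: Hello,\n"
    "this message spans\n"
    "three lines\n"
    "06/12/2019, 7:21 p.m. - Bob: Hi\n"
)


def parse_chat(log):
    """Return (datetime, name, message) tuples, one per chat entry."""
    return [
        (m["datetime"], m["name"], m["message"].strip())
        for m in CHAT_ENTRY.finditer(log)
    ]


parsed = parse_chat(sample_log)
for entry in parsed:
    print(entry)
```

Because the lookahead stops at any line that begins with two digits, a continuation line that itself starts with two digits would cut the message short. A stricter lookahead that repeats the full date pattern would be more robust.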

What is an alternative for lookbehind with C++ RegEx?

ぐ巨炮叔叔 submitted on 2019-12-06 06:06:01
Question: I am using the following pattern (?<=<)(?<!>).*?q.*?(?!<)(?=>) which uses positive and negative lookahead and lookbehind to match the literal q that is enclosed in matching brackets. C++ RegEx does not support lookbehind. What would be a good alternative? Thanks!

Answer 1: Note that (?<=<)(?<!>) is equal to (?<=<) (since a < is required immediately to the left of the current location, there cannot be any > there) and (?!<)(?=>) is equal to (?=>) (the same logic applies here, as > must be immediately to the
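The rest of the answer is cut off, so the following is not its recommendation. It is only a sketch of one common workaround: consume the opening < instead of looking behind for it, and read the text from a capture group. Python is used here purely to check that the three patterns agree on a made-up sample string. In C++ the same capture-group pattern would be read through sub-match 1 of the match result.

```python
import re

# Hypothetical sample text for comparison.
text = "a <xqx> b <yy> c"

# The pattern from the question (needs lookbehind support).
original = re.compile(r"(?<=<)(?<!>).*?q.*?(?!<)(?=>)")
# The simplification described in the answer.
simplified = re.compile(r"(?<=<).*?q.*?(?=>)")
# Lookbehind-free variant: match the '<' and '>' and capture what is between.
captured = re.compile(r"<(.*?q.*?)>")

original_hits = original.findall(text)
simplified_hits = simplified.findall(text)
captured_hits = captured.findall(text)  # findall returns group 1 here

print(original_hits, simplified_hits, captured_hits)
```

One difference to keep in mind: the capture-group variant consumes the delimiters, so overlapping or adjacent matches can behave differently from the lookaround versions on other inputs.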

Regex PHP. Reduce steps: limited by fixed width Lookbehind

天涯浪子 submitted on 2019-12-06 04:44:24
I have a regex that will be used to match @users tags. I use lookaround assertions, letting punctuation and whitespace characters surround the tags. There is an added complication: there is a type of BBCode that represents HTML. I have two types of BBCodes, inline ( ^B bold ^b ) and blocks ( ^C center ^c ). The inline ones have to be passed through to reach the previous or next character, and the blocks are allowed to surround a tag, just like punctuation. I made a regex that does work. What I want to do now is to lower the number of steps that it takes on every character that’s not going to

RegEx for adding underscore before capitalized letters

若如初见. submitted on 2019-12-06 01:34:33
How do I add an underscore (_) before capitalized letters in a string, except the first one? [1] "VarLengthMean" "VarWidthMean" I want it to become: [1] "Var_Length_Mean" "Var_Width_Mean" I considered using str_replace_all from stringr, but I can't figure out which regexp I should use. How do I solve this problem?

akrun: One option would be to capture the lower case letter and the following upper case letter, and then insert the _ while adding the backreferences ( \\1 , \\2 ) of the captured groups: sub("([a-z])([A-Z])", "\\1_\\2", v1) #[1] "Var_Length" "Var_Width" If there are more instances,
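The answer breaks off before it covers strings with more than one boundary. As a rough illustration in Python rather than R, the same capture-group idea is shown below next to a lookaround variant that inserts the underscore at a zero-width position. In Python, `re.sub` replaces every occurrence unless `count` is given, so `count=1` is used to mimic a first-match-only replacement.

```python
import re

names = ["VarLengthMean", "VarWidthMean"]

# Capture the lowercase/uppercase pair and put "_" between the two groups.
with_groups = [re.sub(r"([a-z])([A-Z])", r"\1_\2", n) for n in names]

# Lookaround variant: match the empty position between a lowercase letter
# and an uppercase letter, and insert "_" there.
with_lookarounds = [re.sub(r"(?<=[a-z])(?=[A-Z])", "_", n) for n in names]

# Replacing only the first occurrence leaves later boundaries untouched.
first_only = [re.sub(r"([a-z])([A-Z])", r"\1_\2", n, count=1) for n in names]

print(with_groups)
print(with_lookarounds)
print(first_only)
```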

Nested regex lookahead and lookbehind

南笙酒味 submitted on 2019-12-05 18:28:51
Question: I am having problems with nested positive/negative lookahead/lookbehind in regex. Let's say that I want to replace each '*' in a string with '%', and that '\' escapes the next character (I am turning a regex into an SQL LIKE command ^^). So the string '*test*' should be changed to '%test%' , '\\*test\\*' -> '\\%test\\%' , but '\*test\*' and '\\\*test\\\*' should stay the same. I tried: (?<!\\)(?=\\\\)*\* but this doesn't work. I also tried (?<!\\)((?=\\\\)*\*) ... (?<!\\(?=\\\\)*)\* ... (?=(?<!\\)(?=\\\\)*)\* ...
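No answer is visible in this excerpt. One way to sidestep the nested lookarounds altogether is a different technique: match either an escape pair (a backslash plus the character it escapes) or a bare star, and only rewrite the bare star. A minimal Python sketch, with `stars_to_percent` as a made-up helper name:

```python
import re

# Either an escape pair (backslash + any one character) or an unescaped '*'.
# The escape pair is listed first so it consumes "\*" and "\\" as units.
TOKEN = re.compile(r"\\.|\*", re.DOTALL)


def stars_to_percent(s):
    """Replace unescaped '*' with '%', leaving escape pairs untouched."""
    return TOKEN.sub(lambda m: "%" if m.group() == "*" else m.group(), s)


# The four cases from the question, written as raw strings.
for case in ["*test*", r"\\*test\\*", r"\*test\*", r"\\\*test\\\*"]:
    print(case, "->", stars_to_percent(case))
```

Because the input is scanned left to right and escape pairs are consumed whole, an even run of backslashes leaves the following star unescaped and an odd run escapes it, which is the behaviour the question asks for.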

Regex match everything between two {}

橙三吉。 submitted on 2019-12-05 17:02:20
I was looking at different answers here, but unfortunately none of them fit my case, so I hope you don't mind me asking. I need to match everything between two curly brackets {}, without the brackets themselves, except when the match is preceded by @, e.g.: "This is a super text { match_this }" "{ match_this }" "This is another example @{deal_with_it}" Here are my test strings; 1, 2 and 3 are valid while the last one shouldn't match: 1 {eww} 2 r23r23{fetwe} 3 #{d2dded} 4 @{d2dded} I was trying with: (?<=[^@]\{)[^\}]*(?=\}) Then only the 2nd and 3rd options were matched (without the first one)
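The first test string fails because [^@] must consume a real character before the {, and there is none at the start of the string. A negative lookbehind does not have that requirement. The sketch below (Python, using the four test strings from the question) shows a capture-group version and a lookaround-only version. Both are illustrations, not the answer from the original thread.

```python
import re

samples = ["{eww}", "r23r23{fetwe}", "#{d2dded}", "@{d2dded}"]

# Capture-group version: '{' must not be preceded by '@'.
BRACED = re.compile(r"(?<!@)\{([^}]*)\}")

# Lookaround-only version: preceded by '{' but not by '@{'.
BRACED_LOOKAROUND = re.compile(r"(?<=\{)(?<!@\{)[^}]*(?=\})")

captured = [BRACED.findall(s) for s in samples]
lookaround = [BRACED_LOOKAROUND.findall(s) for s in samples]

print(captured)
print(lookaround)
```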

JavaScript: The Good Parts; why is lookahead not good?

眉间皱痕 submitted on 2019-12-05 13:25:21
Question: I'm reading Douglas Crockford's JavaScript: The Good Parts, and I just finished the regular expressions chapter. In this chapter he calls JavaScript's \b , positive lookahead (?=) and negative lookahead (?!) "not a good part". He explains the reason for \b being not good (it uses \w for word boundary finding, and \w fails for any language that uses Unicode characters), and that looks like a very good reason to me. Unfortunately, the reason for positive and negative lookahead being not good is left

Java regex with a positive look behind of a negative look ahead

£可爱£侵袭症+ submitted on 2019-12-05 11:33:08
I am trying to extract, from this kind of string ou=persons,ou=(.*),dc=company,dc=org , the last part that is immediately preceded by a comma and not followed by (.*). In the last case, this should give dc=company,dc=org . In regex terms, this seems to be a positive lookbehind (preceded by) containing a negative lookahead. So I arrived at this regex: (?<=(,(?!.*\Q(.*)\E))).* , but it returns ,dc=company,dc=org with the comma. I want the same thing without the comma. What am I doing wrong?

The comma appears because the capturing group contains it. You can make the outside capture group noncapturing with (?:) (?<
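The answer's pattern is cut off and is not reconstructed here. As a separate illustration in Python, the two conditions can be written side by side instead of nested: a lookbehind for the comma and a negative lookahead for a later literal (.*). Python's `re` module has no \Q...\E quoting, so the literal is escaped by hand.

```python
import re

dn = "ou=persons,ou=(.*),dc=company,dc=org"

# Start right after a comma, provided no literal "(.*)" occurs later on,
# then take the rest of the string.
TAIL = re.compile(r"(?<=,)(?!.*\(\.\*\)).*")

match = TAIL.search(dn)
tail = match.group() if match else None
print(tail)  # dc=company,dc=org
```

Since both assertions are zero-width, the match itself starts after the comma, so no capture group is needed to exclude it.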

Regex lookahead

喜欢而已 submitted on 2019-12-05 10:31:00
I am using a regex to find test:? followed by any characters until it hits the next test:? . When I run this regex I made: ((?:test:\?)(.*)(?!test:\?)) on this text: test:?foo2=bar2&baz2=foo2test:?foo=bar&baz=footest:?foo2=bar2&baz2=foo2 I expected to get: test:?foo2=bar2&baz2=foo2 test:?foo=bar&baz=foo test:?foo2=bar2&baz2=foo2 But instead it matches everything. Does anyone with more regex experience know where I have gone wrong? I've used regexes for pattern matching before, but this is my first experience with lookarounds/lookaheads. Thanks in advance for any help/tips/pointers :-)

I guess you
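The answer is cut off, so the following is only a sketch of the usual explanation and fix. The greedy .* runs to the end of the text, and the negative lookahead is then checked only at that final position, where it trivially succeeds. Making the quantifier lazy and using a positive lookahead for the next test:? (or the end of the text) yields the three expected pieces. In Python, on the text from the question:

```python
import re

text = "test:?foo2=bar2&baz2=foo2test:?foo=bar&baz=footest:?foo2=bar2&baz2=foo2"

# The question's pattern: greedy .* swallows everything to the end.
greedy = re.compile(r"((?:test:\?)(.*)(?!test:\?))")
greedy_match = greedy.search(text).group(1)

# Lazy .*? stops as soon as the next "test:?" (or the end) is ahead.
lazy = re.compile(r"test:\?.*?(?=test:\?|$)")
pieces = lazy.findall(text)

print(greedy_match == text)
print(pieces)
```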

find search item plus 4 lines before and after

巧了我就是萌 submitted on 2019-12-04 15:50:40
I am using Notepad++ and would like to find the context in which a particular string occurs. The search string is 0wh.*0subj and I would like to find this search item plus the 4 lines immediately before and after it. For example, if xxx means whatever is on a line, the search result should be: xxx xxx xxx xxx 0wh.*0subj xxx xxx xxx xxx I have tried using \n\r but it's not working. Any assistance would be greatly appreciated. Regards

This will work in Notepad++ (tested): (?m)(^[^\r\n]*\R+){4}0wh\.\*0subj[^\r\n]*\R+(^[^\r\n]*\R+){4} On the screenshot, note that the 555 line is not selected. It is
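To see the same idea outside Notepad++, here is a rough Python translation. Python's `re` module has no \R, so this sketch assumes one line break per line, written as \r?\n, which means blank lines are counted as lines rather than skipped as \R+ would allow. Like the answer, it treats 0wh.*0subj as literal text at the start of a line. The sample lines are made up.

```python
import re

# Four full lines, the target line, then four more full lines.
CONTEXT = re.compile(
    r"(?:^[^\r\n]*\r?\n){4}"        # 4 lines before
    r"^0wh\.\*0subj[^\r\n]*\r?\n"   # the line starting with the literal text
    r"(?:^[^\r\n]*\r?\n){4}",       # 4 lines after
    re.MULTILINE,
)

# Hypothetical sample: five lines, the target, five more lines.
lines = ["l1", "l2", "l3", "l4", "l5", "0wh.*0subj",
         "l6", "l7", "l8", "l9", "l10"]
text = "\n".join(lines) + "\n"

match = CONTEXT.search(text)
context_lines = match.group().splitlines() if match else []
print(context_lines)
```

The first and last sample lines fall outside the four-line window, so they are not part of the match.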