regex-lookarounds | 易学教程

RegEx for capturing a repeating pattern

Question: I have the following regex, from "regex capturing with repeating pattern":

    ([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)

It takes the following string:

    1h 30min: Title - Description Line 1 1h 30min: Title - Description Line 1 - Description Line 2 - Description Line 3

And produces this as a result:

    Match 1: "1h 30min: Title - Description Line 1"
    Group 1: "1h"
    Group 2: "30min"
    Group 3: "Title - Description Line 1"
    Match 2: "1h 30min: Title - Description Line 1 - Description Line 2
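To see how the pattern from the question behaves, here is a minimal Python sketch. It assumes the sample input originally had a line break before each "- Description" line and before the second "1h 30min" entry (the excerpt does not preserve line breaks); the variable names are illustrative only.

```python
import re

# Pattern copied from the question.
PATTERN = re.compile(
    r"([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)"
)

# Assumed layout of the sample input: one item per line.
text = (
    "1h 30min: Title\n"
    "- Description Line 1\n"
    "1h 30min: Title\n"
    "- Description Line 1\n"
    "- Description Line 2\n"
    "- Description Line 3"
)

matches = list(PATTERN.finditer(text))
for number, match in enumerate(matches, start=1):
    # Group 3 keeps absorbing lines until one starts with an hour marker.
    print(f"Match {number}: {match.group(0)!r}")
    print(f"  groups: {match.groups()!r}")
```

The negative lookahead after each newline is what stops group 3 from running into the next "1h 30min" entry.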

Regex to match if given text is not found and match as little as possible

Question: I have this text:

    <a> sdfsdf DDzz sdfsdf </a> <a> sdfsdf DDzz sdfsdf </a> <a> sdfsdf BBzz sdfsdf </a> <a> sdfsdf DDzz sdfsdf </a>

I can't parse it as XML; I need to use a regex here. Also, this is only an example. I want a regex that matches every <a>...</a> group that does not contain an element b whose text starts with BB. I came up with this regex:

    <a>.*?(?!B).*?.*?</a>

But it matches the last group as:

    <a> sdfsdf BBzz sdfsdf </a> <a> sdfsdf DDzz sdfsdf
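One technique often used for this kind of requirement is to guard every consumed character with a negative lookahead (sometimes called a tempered greedy token). The Python sketch below applies it to the sample text exactly as it appears in the excerpt; it is not the answer from the original thread, and the pattern is an assumption about what "does not contain BB" should mean.

```python
import re

text = (
    "<a> sdfsdf DDzz sdfsdf </a> <a> sdfsdf DDzz sdfsdf </a> "
    "<a> sdfsdf BBzz sdfsdf </a> <a> sdfsdf DDzz sdfsdf </a>"
)

# A character inside the group is consumed only if neither "</a>" nor "BB"
# starts at that position, so a match can never cross a closing tag or "BB".
pattern = re.compile(r"<a>(?:(?!</a>|BB).)*</a>", re.DOTALL)

groups = pattern.findall(text)
for group in groups:
    print(group)
```

Because the lookahead is checked before each character, the group containing BBzz cannot match at all, and no match can spill over into the next group.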

Regex to extract cents value from arbitrary currency formatting

Question: I need to extract the cents value from the following possible values using a Java regex (the thousands separator could be either a dot or a comma):

    $123,456.78
    123,456.78 dollars
    123,456.78

I have a partially working solution:

    [\.,]\d\d\D

The problem with my solution is that it doesn't work for "123,456.78", where the last digit is the end of the string. How can I handle this case? http://java-regex-tester.appspot.com/regex/6af08221-63cb-4c5b-a865-c86fe5e825ff

Answer 1: Note that \D requires a character that is not a digit
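The hint in the answer is that \D must consume a character, so it fails at the end of the string. One way to act on that hint, sketched here in Python rather than Java, is to replace \D with a negative lookahead (?!\d), which asserts "no digit follows" without consuming anything. The pattern and names below are illustrative assumptions, not the rest of the truncated answer.

```python
import re

# Separator, two digits (captured), then "not followed by another digit".
CENTS = re.compile(r"[.,](\d\d)(?!\d)")

samples = ["$123,456.78", "123,456.78 dollars", "123,456.78"]
cents = [CENTS.search(sample).group(1) for sample in samples]

for sample, value in zip(samples, cents):
    print(f"{sample!r} -> {value}")
```

The same pattern text should carry over to a Java string literal once the backslashes are doubled.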

How to write optional word in Regular Expression?

Question: I want to write a Java regular expression that recognises the following patterns:

    abc def the ghi
    abc def ghi

I tried this:

    abc def (the)? ghi

But it does not recognise the second pattern. Where am I going wrong?

Answer 1:

    abc def (the )?ghi

Remove the extra space.

Answer 2: Spaces are also valid characters in a regex, so

    abc def (the)? ghi

with a space on each side of (the)?, can match only

    abc def the ghi

or, when the word is removed,

    abc def  ghi

with two consecutive spaces. You need something like

    abc def( the)? ghi

to also
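The difference between the three patterns can be checked quickly. The sketch below uses Python's re module instead of Java, since the patterns use no Java-specific syntax; the dictionary and variable names are illustrative.

```python
import re

phrases = ["abc def the ghi", "abc def ghi"]

candidates = {
    "original": re.compile(r"abc def (the)? ghi"),   # space on both sides of the group
    "answer 1": re.compile(r"abc def (the )?ghi"),   # trailing space moved into the group
    "answer 2": re.compile(r"abc def( the)? ghi"),   # leading space moved into the group
}

# For each pattern, record whether it matches each phrase in full.
results = {
    name: [bool(regex.fullmatch(phrase)) for phrase in phrases]
    for name, regex in candidates.items()
}

for name, outcome in results.items():
    print(name, outcome)
```

Only the original pattern rejects "abc def ghi", because it still demands two spaces when the optional word is absent.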

Java support for conditional lookahead

Question: In the following (let's say) ZIP codes, I am trying to exclude the 33333- from the result. I do:

    String zip = "11111 22222 33333- 44444-4444";
    String regex = "\\d{5}(?(?=-)-\\d{4})";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(zip);
    while (matcher.find()) {
        System.out.println(" Found: " + matcher.group());
    }

I expect to get:

    Found: 11111
    Found: 22222
    Found: 44444-4444

I am trying to enforce the format: 5 digits, optionally followed by a - and 4 digits. 5 digits with
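java.util.regex has no conditional (?(condition)yes|no) construct, so the pattern in the question cannot compile there. The expected output can usually be obtained without a conditional by making the "-dddd" part optional and then forbidding a trailing dash with a negative lookahead. The Python sketch below shows that idea; the pattern is an assumption based on the expected output, not the answer from the original thread.

```python
import re

zip_text = "11111 22222 33333- 44444-4444"

# Five digits, an optional "-dddd" suffix, and then neither a dash nor
# another digit may follow. \b keeps matches from starting mid-number.
ZIP = re.compile(r"\b\d{5}(?:-\d{4})?(?![-\d])")

found = ZIP.findall(zip_text)
for code in found:
    print(" Found:", code)
```

The same pattern text should also be accepted by Java once the backslashes are doubled in the string literal, since it uses only lookahead and a non-capturing group.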

Consequences of Inserting Positive Lookbehind into Arbitrary Regex to Simulate Byte Offset

Question: What would be the consequences of inserting a positive lookbehind for n bytes, (?<=\C{n}), at the beginning of an arbitrary regular expression, particularly when used for replacement operations? At least within PHP, the regex match functions preg_match and preg_match_all allow matching to begin after a given byte offset. There is no corresponding feature in any of the other PCRE PHP functions - you can specify a limit to the number of replacements done by preg_replace for instance,
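The idea can be tried out in Python as a rough analogue. Python's re module has no \C, so this sketch works on bytes patterns and uses [\x00-\xff]{n} as the "any n bytes" lookbehind. The helper name replace_after_offset is hypothetical, and the behaviour of PCRE in PHP may differ in details.

```python
import re

def replace_after_offset(pattern: bytes, repl: bytes, data: bytes, offset: int) -> bytes:
    """Replace matches of `pattern` that start at or after byte `offset`."""
    if offset > 0:
        # Require `offset` arbitrary bytes before the match start.
        guard = rb"(?<=[\x00-\xff]{%d})" % offset
        pattern = guard + rb"(?:" + pattern + rb")"
    return re.sub(pattern, repl, data)

result = replace_after_offset(b"a", b"X", b"aaaaaaaa", 4)
print(result)
```

One consequence visible even in this sketch: the guard only constrains where a match may start. Any lookbehind inside the original pattern can still inspect bytes before the offset, which is not the same as matching against a truncated subject.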

Using regex lookahead, egrep

Question: If your file contains

    apples are good
    apple cider is also good

why would egrep '(?=apples)app' file fail to pick up any lines? Using egrep 2.5.1 on a Mac.

Answer 1: Extended regular expressions do not have a positive lookahead feature. See the regex flavor comparison.

Source: https://stackoverflow.com/questions/10645542/using-regex-lookahead-egrep
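For contrast, here is what the same pattern does in an engine that does support lookahead. The sketch uses Python's re module as a stand-in; the two lines are the file contents from the question, and the variable names are illustrative.

```python
import re

lines = ["apples are good", "apple cider is also good"]

# (?=apples) asserts that "apples" starts here; "app" is then consumed.
LOOKAHEAD = re.compile(r"(?=apples)app")

picked = [line for line in lines if LOOKAHEAD.search(line)]
print(picked)
```

In a POSIX extended regular expression the sequence (?= has no lookahead meaning, which is consistent with egrep selecting no lines.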

Capture stream of digits which is not followed by certain digits

Question: I want to capture a stream of digits that is not followed by certain digits. For example:

    input = abcdef lookbehind 123456..... asjdnasdh lookbehind 789432

I want to capture 789432 and not 123, using negative lookahead only. I tried (?<=lookbehind )([\d])+(?!456), but it captures 123456 and 789432. Using (?<=lookbehind )([\d])+?(?!456) captures only 1 and 7. Grouping is not an option for me, as my use case doesn't allow it. Is there any way I can capture 789432 and not 123 using
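Both attempts fail because the lookahead is tested only after the digits have been consumed, so the engine can always backtrack to a position where 456 does not follow. One possible fix is to place the negative lookahead before the digits and let it scan the whole run. The Python sketch below assumes that a run should be rejected whenever 456 appears anywhere in it; that reading of the truncated question is an assumption.

```python
import re

text = "abcdef lookbehind 123456..... asjdnasdh lookbehind 789432"

# The lookahead runs before any digit is consumed: reject the position
# if the upcoming digit run contains "456", otherwise take the whole run.
pattern = re.compile(r"(?<=lookbehind )(?!\d*456)\d+")

captured = pattern.findall(text)
print(captured)
```

No capturing group is used, so the full match itself is the captured value.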

RegEx find all XML tags

Question: How do I match all the opening tags in an XML document with a regex? I just need to collect the tag names used. This is what I have:

    (?<=<)(.*?)((?= \/>)|(?=>))

This matches all the opening and closing tags. Example:

    <Habazutty>yaddayadda</Habazutty>
    <Vogons />
    <Targ>blahblah</Targ>

The above regex matches:

    Habazutty /Habazutty Vogons Targ /Targ

I only need:

    Habazutty Vogons Targ

I couldn't figure out a way to exclude the closing tags. Negative lookahead didn't work - found nothing. I must have
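A negative lookahead can exclude closing tags if it is placed directly after the "<". The Python sketch below is one way to do that on the sample from the question; the pattern is an assumption, not the accepted answer, and it ignores attributes, CDATA sections and other XML details that a real parser would handle.

```python
import re

xml = "<Habazutty>yaddayadda</Habazutty> <Vogons /> <Targ>blahblah</Targ>"

# "<" must not be followed by "/", "!" or "?" (closing tags, comments,
# declarations); the name then runs up to whitespace, "/" or ">".
OPENING_TAG = re.compile(r"<(?![/!?])([^\s/>]+)")

names = OPENING_TAG.findall(xml)
print(names)
```

Wrapping the result in set() would give the distinct tag names if duplicates are not wanted.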

regular expression matching a string that is followed with another string without capturing the latter

Question: Is it possible to make a lookahead assertion non-capturing? Things like bar(?:!foo) and bar(?!:foo) do not work (Python).

Answer 1: If you apply bar(?=ber) to "barber", "bar" is matched, but "ber" is not captured.

Answer 2: You didn't respond to Alan's question, but I'll assume that he's correct and you're interested in a negative lookahead assertion, in other words: match 'bar' but NOT 'barfoo'. In that case, you can construct your regex as follows:

    myregex = re.compile('bar(?!foo)')

for example, from the
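Both answers can be checked directly. The short Python sketch below uses the two patterns from the answers; the test strings other than "barber" and "barfoo" are illustrative.

```python
import re

# Positive lookahead: "ber" must follow, but it is not part of the match.
positive = re.search(r"bar(?=ber)", "barber")
print(positive.group(), positive.span())

# Negative lookahead: "bar" matches only when "foo" does not follow.
myregex = re.compile("bar(?!foo)")
rejected = myregex.search("barfoo")
accepted = myregex.search("barbaz")
print(rejected, accepted.group())
```

Lookahead assertions are zero-width, so they never add text to the match and never create a group on their own.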