regex-lookarounds

RegEx for capturing a repeating pattern

血红的双手。 posted on 2019-12-11 05:22:34
Question: I have the following regex, from "regex capturing with repeating pattern":
    ([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)
It takes the following string:
    1h 30min: Title - Description Line 1 1h 30min: Title - Description Line 1 - Description Line 2 - Description Line 3
and produces this result:
    Match 1: "1h 30min: Title - Description Line 1"
    Group 1: "1h"
    Group 2: "30min"
    Group 3: "Title - Description Line 1"
    Match 2: "1h 30min: Title - Description Line 1 - Description Line 2
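To see what the regex in this question captures, here is a minimal Python sketch. It assumes that each "- Description" item in the sample starts on its own line (the excerpt shows the input flattened onto one line, so that layout is a guess), and the names PATTERN, SAMPLE and parse_entries are made up for illustration.

```python
import re

# The regex from the question, unchanged.
PATTERN = re.compile(
    r"([0-9]{1,2}h)[ ]*([0-9]{1,2}min):[ ]*(.*(?:\n(?![0-9]{1,2}h).*)*)"
)

# Assumed layout: every "- Description" item sits on its own line.
SAMPLE = "\n".join([
    "1h 30min: Title",
    "- Description Line 1",
    "1h 30min: Title",
    "- Description Line 1",
    "- Description Line 2",
    "- Description Line 3",
])


def parse_entries(text):
    """Return one (hours, minutes, body) tuple per entry found in text."""
    return [m.groups() for m in PATTERN.finditer(text)]


entries = parse_entries(SAMPLE)
for hours, minutes, body in entries:
    print(hours, minutes, repr(body))
```

The negative lookahead (?![0-9]{1,2}h) is what stops the third group from running into the next entry: a continuation line is consumed only if it does not begin with an hour marker.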

Regex to match if given text is not found and match as little as possible

时间秒杀一切 posted on 2019-12-11 05:13:52
Question: I have this text:
    <a> sdfsdf <b>DDzz</b> sdfsdf </a>
    <a> sdfsdf <b>DDzz</b> sdfsdf </a>
    <a> sdfsdf <b>BBzz</b> sdfsdf </a>
    <a> sdfsdf <b>DDzz</b> sdfsdf </a>
I can't parse it as XML; I need to use a regex here. This is also only an example. I want a regex that matches every <a>...</a> group that does not contain a b element whose text starts with BB. I came up with this regex:
    <a>.*?<b>(?!B).*?</b>.*?</a>
But it matches the last group as:
    <a> sdfsdf <b>BBzz</b> sdfsdf </a> <a> sdfsdf <b>DDzz</b> sdfsdf
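One common way to keep a lazy match from spilling across group boundaries is a "tempered" dot, where a negative lookahead is checked before every single character. The sketch below is one possible approach, not necessarily the answer given in the original thread; it assumes <a> elements are never nested, and the names GROUP_WITHOUT_BB and TEXT are illustrative.

```python
import re

# Tempered dot: consume one character at a time, but never step onto
# the closing </a> or onto a <b> element whose text starts with BB.
GROUP_WITHOUT_BB = re.compile(r"<a>(?:(?!</a>|<b>BB).)*</a>", re.DOTALL)

TEXT = (
    "<a> sdfsdf <b>DDzz</b> sdfsdf </a>\n"
    "<a> sdfsdf <b>DDzz</b> sdfsdf </a>\n"
    "<a> sdfsdf <b>BBzz</b> sdfsdf </a>\n"
    "<a> sdfsdf <b>DDzz</b> sdfsdf </a>"
)

# No capturing groups in the pattern, so findall returns whole matches.
groups = GROUP_WITHOUT_BB.findall(TEXT)
for group in groups:
    print(group)
```

Because the repeated part can never pass </a>, a match cannot start in one group and end in the next, which is the failure described in the question.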

Regex to extract cents value from arbitrary currency formatting

强颜欢笑 posted on 2019-12-11 01:12:50
Question: I need to extract the cents value from the following possible values using a Java regex (the thousands separator can be either a dot or a comma):
    $123,456.78
    123,456.78 dollars
    123,456.78
I have a partially working solution:
    [\.,]\d\d\D
The problem with my solution is that it doesn't work for "123,456.78", where the last digit is at the end of the string. How can I handle this case? http://java-regex-tester.appspot.com/regex/6af08221-63cb-4c5b-a865-c86fe5e825ff
Answer 1: Note that \D requires a character that is not a digit
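The point about \D can be illustrated by swapping it for a negative lookahead, which asserts "no digit follows" without needing a character to be there. This is a sketch in Python rather than Java (the lookahead syntax is the same in both); CENTS and extract_cents are hypothetical names.

```python
import re

# A separator, exactly two digits, and then anything except another digit.
# (?!\d) also succeeds at the end of the string, where \D would fail.
CENTS = re.compile(r"[.,](\d{2})(?!\d)")


def extract_cents(value):
    """Return the two-digit cents part of value, or None if there is none."""
    match = CENTS.search(value)
    return match.group(1) if match else None


for sample in ["$123,456.78", "123,456.78 dollars", "123,456.78"]:
    print(sample, "->", extract_cents(sample))
```

The thousands group ",456" is skipped because the two digits after the comma are followed by a third digit.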

How to write optional word in Regular Expression?

自古美人都是妖i posted on 2019-12-10 16:27:07
Question: I want to write a Java regular expression that recognises the following patterns:
    abc def the ghi
and
    abc def ghi
I tried this:
    abc def (the)? ghi
But it is not recognizing the second pattern. Where am I going wrong?
Answer 1:
    abc def (the )?ghi
Remove the extra space.
Answer 2: Spaces are also valid characters in a regex, so
    abc def (the)? ghi
           ^      ^ --- spaces
can match only
    abc def the ghi
           ^   ^---spaces
or, when we remove the word,
    abc def  ghi
           ^^---spaces
You need something like abc def( the)? ghi to also
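The difference between the two patterns is easy to check mechanically. The sketch below uses Python's re module for brevity; the same patterns behave identically in Java. BROKEN and FIXED are illustrative names, and the fixed pattern uses a non-capturing group, which is an optional tweak.

```python
import re

# Original attempt: the spaces on both sides of (the)? are mandatory.
BROKEN = re.compile(r"abc def (the)? ghi")
# The space travels with the optional word, so it disappears with it.
FIXED = re.compile(r"abc def (?:the )?ghi")

with_word = "abc def the ghi"
without_word = "abc def ghi"

print(bool(BROKEN.fullmatch(with_word)), bool(BROKEN.fullmatch(without_word)))
print(bool(FIXED.fullmatch(with_word)), bool(FIXED.fullmatch(without_word)))
```

The original pattern does accept "abc def  ghi" with two spaces, which is exactly what the caret diagrams in the answers point out.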

Java support for conditional lookahead

≯℡__Kan透↙ posted on 2019-12-10 16:24:49
Question: In the following list of, let's say, zip codes, I am trying to exclude the 33333- from the result. I do:
    String zip = "11111 22222 33333- 44444-4444";
    String regex = "\\d{5}(?(?=-)-\\d{4})";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(zip);
    while (matcher.find()) {
        System.out.println(" Found: " + matcher.group());
    }
I expect to get:
    Found: 11111
    Found: 22222
    Found: 44444-4444
I am trying to enforce the format: 5 digits optionally followed by a - and 4 digits. 5 digits with
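Java's java.util.regex does not support the (?(condition)yes|no) conditional construct, so the "if a dash follows, require four digits" logic has to be written some other way. One option is a plain alternation with a negative lookahead, sketched here in Python (the pattern syntax is shared with Java). This is an illustrative rewrite, not necessarily the accepted answer, and ZIP_CODE is a made-up name.

```python
import re

# Five digits, then EITHER a dash plus four digits, OR no dash at all.
ZIP_CODE = re.compile(r"\d{5}(?:-\d{4}|(?!-))")

zip_codes = "11111 22222 33333- 44444-4444"

found = ZIP_CODE.findall(zip_codes)
for code in found:
    print(" Found:", code)
```

For "33333-" both branches fail: the dash is not followed by four digits, and the lookahead rejects a bare trailing dash, so that token is skipped entirely.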

Consequences of Inserting Positive Lookbehind into Arbitrary Regex to Simulate Byte Offset

别来无恙 posted on 2019-12-10 12:39:19
Question: What would be the consequences of inserting a positive lookbehind for n bytes, (?<=\C{n}), at the beginning of an arbitrary regular expression, particularly when it is used for replacement operations? At least within PHP, the regex match functions preg_match and preg_match_all allow matching to begin after a given byte offset. There is no corresponding feature in any of the other PCRE PHP functions - you can specify a limit to the number of replacements done by preg_replace for instance,

Using regex lookahead, egrep

杀马特。学长 韩版系。学妹 posted on 2019-12-10 12:29:04
Question: If your file contains
    apples are good
    apple cider is also good
why would egrep '(?=apples)app' file fail to pick up any lines? Using egrep 2.5.1 on a Mac.
Answer 1: Extended regular expressions (ERE) do not have a positive lookahead feature. See the regex flavor comparison.
Source: https://stackoverflow.com/questions/10645542/using-regex-lookahead-egrep
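For comparison, here is what the same pattern selects in an engine that does support lookahead. The sketch uses Python's re module as a stand-in and keeps the two sample lines in a list instead of a file; LINES and LOOKAHEAD are illustrative names.

```python
import re

LINES = ["apples are good", "apple cider is also good"]

# (?=apples) asserts that "apples" starts here, then "app" is consumed.
LOOKAHEAD = re.compile(r"(?=apples)app")

matching = [line for line in LINES if LOOKAHEAD.search(line)]
print(matching)
```

Only the first line is selected, because the lookahead requires the full word "apples" at the position where "app" begins.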

Capture stream of digits which is not followed by certain digits

北城以北 posted on 2019-12-10 11:37:36
Question: I want to capture a stream of digits that is not followed by certain digits. For example:
    input = abcdef lookbehind 123456..... asjdnasdh lookbehind 789432
I want to capture 789432 and not 123, using negative lookahead only. I tried (?<=lookbehind )([\d])+(?!456), but it captures 123456 and 789432. Using (?<=lookbehind )([\d])+?(?!456) captures only 1 and 7. Grouping is not an option for me, as my use case doesn't allow it. Is there any way I can capture 789432 and not 123 using
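Both attempts fail because the lookahead is tested only after the digits have been consumed, so the engine can backtrack or extend until the check passes. One way around that, sketched below, is to run the negative lookahead before consuming anything. This assumes the intent is "reject the whole digit run if 456 appears anywhere in it", which is an interpretation of the truncated question; DIGITS_WITHOUT_456 and TEXT are illustrative names.

```python
import re

# The lookahead scans the upcoming digit run for 456 before \d+ consumes it.
# No capturing group is used: the whole match is the digit run.
DIGITS_WITHOUT_456 = re.compile(r"(?<=lookbehind )(?!\d*456)\d+")

TEXT = "abcdef lookbehind 123456..... asjdnasdh lookbehind 789432"

runs = DIGITS_WITHOUT_456.findall(TEXT)
print(runs)
```

Since the fixed-width lookbehind pins the start of the match, a rejected run such as 123456 cannot be partially matched from a later position.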

RegEx find all XML tags

让人想犯罪 __ posted on 2019-12-10 11:32:47
Question: How do I match all the beginning tags in an XML document with a regex? I just need to collect the tag names used. This is what I have:
    (?<=<)(.*?)((?= \/>)|(?=>))
This matches all the beginning and closing tags. Example:
    <Habazutty>yaddayadda</Habazutty>
    <Vogons />
    <Targ>blahblah</Targ>
The regex above matches:
    Habazutty /Habazutty Vogons Targ /Targ
I only need:
    Habazutty Vogons Targ
I couldn't figure out a way to exclude the closing tags. Negative lookahead didn't work - found nothing. I must have
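Closing tags can be excluded by forbidding a slash in the tag name itself, so nothing matches right after "</". The following Python sketch is one possible approach under the assumption that the input contains only plain elements (no comments, CDATA sections or processing instructions); START_TAG_NAME and XML are illustrative names.

```python
import re

# Directly after "<", take characters that are not "/", whitespace or ">".
# A closing tag begins with "</", so the character class fails immediately.
START_TAG_NAME = re.compile(r"(?<=<)[^/\s>]+")

XML = (
    "<Habazutty>yaddayadda</Habazutty>\n"
    "<Vogons />\n"
    "<Targ>blahblah</Targ>"
)

# dict.fromkeys removes repeats while keeping first-seen order.
tag_names = list(dict.fromkeys(START_TAG_NAME.findall(XML)))
print(tag_names)
```

For real-world documents an XML parser is usually the more robust choice; the regex is only adequate for simple, well-behaved input like the example.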

regular expression matching a string that is followed with another string without capturing the latter

跟風遠走 posted on 2019-12-10 05:38:33
Question: Is it possible to make a lookahead assertion non-capturing? Things like bar(?:!foo) and bar(?!:foo) do not work (Python).
Answer 1: If you apply bar(?=ber) to "barber", "bar" is matched, but "ber" is not captured.
Answer 2: You didn't respond to Alan's question, but I'll assume that he's correct and you're interested in a negative lookahead assertion. In other words, match 'bar' but not 'barfoo'. In that case, you can construct your regex as follows: myregex = re.compile('bar(?!foo)') for example, from the
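Both answers can be checked with a few lines of Python. Lookaheads are zero-width, so the text they inspect is never part of the match. The variable names below are illustrative; myregex follows the name used in Answer 2.

```python
import re

# Positive lookahead: "bar" matches only when "ber" follows,
# but "ber" itself is not included in the match.
positive = re.search(r"bar(?=ber)", "barber")
print(positive.group())

# Negative lookahead: "bar" matches only when "foo" does NOT follow.
myregex = re.compile(r"bar(?!foo)")
rejected = myregex.search("barfoo")
accepted = myregex.search("barbaz")
print(rejected, accepted.group())
```

In both cases the match object spans only the three characters of "bar"; the lookahead contributes a condition, not text.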