regex-greedy | 易学教程

How to tell a RegEx to be greedy on an 'Or' Expression

Question:

Text: [A]I'm an example text [] But I want to be included [[]] [A]I'm another text without a second part []

Regex: \[A\][\s\S]*?(?:(?=\[\])|(?=\[\[\]\]))

Using the above regex, it's not possible to capture the second part of the first text. Demo

Is there a way to tell the regex to be greedy on the 'or' part? I want to capture the biggest group possible.

Edit 1: Original attempt: Demo

Edit 2: What I want to achieve: In our company, we're using a web service to report our working time. I want to
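The behaviour described in this question can be reproduced with a short Python sketch. The first pattern is the one from the question; the second is only one possible workaround and is an assumption, not something taken from the original thread: it tries the `[[]]` terminator first, refuses to cross into the next `[A]` entry, and only then falls back to the plain `[]` terminator.

```python
import re

text = (
    "[A]I'm an example text [] But I want to be included [[]] "
    "[A]I'm another text without a second part []"
)

# Pattern from the question: the lazy quantifier stops at the first position
# where either lookahead succeeds, which here is the earlier "[]".
original = re.compile(r"\[A\][\s\S]*?(?:(?=\[\])|(?=\[\[\]\]))")

# Hypothetical alternative (an assumption, not from the source): prefer the
# "[[]]" terminator, never run past the next "[A]", then fall back to "[]".
preferred = re.compile(
    r"\[A\](?:(?!\[A\])[\s\S])*?(?=\[\[\]\])"  # entry that ends at "[[]]"
    r"|\[A\][\s\S]*?(?=\[\])"                  # entry that only has "[]"
)

original_matches = original.findall(text)
preferred_matches = preferred.findall(text)
print(original_matches)
print(preferred_matches)
```

Ordering the alternatives matters here: a backtracking engine takes the first alternative that succeeds, so the longer candidate has to be listed first.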

Bash regex ungreedy match

Question: I have a regex pattern that is supposed to match at multiple places in a string. I want to get all the match groups into one array and then print every element. So, I've been trying this:

#!/bin/bash
f=$'\n\tShare1 Disk\n\tShare2 Disk\n\tPrnt1 Printer'
regex=$'\n\t(.+?)\\s+Disk'
if [[ $f =~ $regex ]]
then
    for match in "${BASH_REMATCH[@]}"
    do
        echo "New match: $match"
    done
else
    echo "No matches"
fi

Result:
New match: Share1 Disk Share2 Disk
New match: Share1 Disk Share2

The expected result
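Bash's `=~` operator uses POSIX extended regular expressions, which define no lazy quantifiers, and `BASH_REMATCH` holds the groups of a single match only. For comparison, the sketch below runs the same pattern in Python, whose `re` module supports `+?` and can collect every match with `findall`. It is an illustration in a different engine, not a fix for the Bash script, and it assumes the goal is one captured name per line that ends in `Disk`.

```python
import re

# Same input as the Bash variable `f` in the question.
listing = "\n\tShare1 Disk\n\tShare2 Disk\n\tPrnt1 Printer"

# `.+?` is lazy here, so each match stops at the first "<whitespace>Disk".
pattern = re.compile(r"\n\t(.+?)\s+Disk")

# findall returns the contents of group 1 for every non-overlapping match.
shares = pattern.findall(listing)
for share in shares:
    print(f"New match: {share}")
```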

Regex with prefix and optional suffix

Question: This is maybe the 100+1st question about regex optional suffixes on SO, but I didn't find any that could help me :( I need to extract part of a string from the common pattern prefix/s/o/m/e/t/h/i/n/g/suffix using a regular expression. The prefix is constant and the suffix may not appear at all, so prefix/(.+)/suffix doesn't meet my requirements. The pattern prefix/(.+)(?:/suffix)? returns s/o/m/e/t/h/i/n/g/suffix. The part (?:/suffix)? must somehow be more greedy. I want to get s/o/m/e/t/h/i
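A minimal Python sketch of the problem, plus one common workaround. The workaround is an assumption about what the asker wants (the path without a trailing `/suffix`), not an answer taken from the thread: making the capture lazy and anchoring the pattern at the end of the string forces the optional suffix to be consumed when it is present.

```python
import re

with_suffix = "prefix/s/o/m/e/t/h/i/n/g/suffix"
without_suffix = "prefix/s/o/m/e/t/h/i/n/g"

# Pattern from the question: greedy `.+` swallows "/suffix" because the
# optional group is allowed to match nothing.
greedy = re.compile(r"prefix/(.+)(?:/suffix)?")

# Hypothetical fix: lazy capture plus `$`, so the engine keeps extending the
# capture only until "optional suffix, then end of string" can succeed.
anchored = re.compile(r"prefix/(.+?)(?:/suffix)?$")

greedy_result = greedy.search(with_suffix).group(1)
anchored_with = anchored.search(with_suffix).group(1)
anchored_without = anchored.search(without_suffix).group(1)
print(greedy_result, anchored_with, anchored_without)
```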

is there a method of rule based matching of spacy to match patterns?

Question: I want to use rule-based matching. I have text in which each word is tagged with its POS: text1= "it_PRON is_AUX a_DET beautiful_ADJ apple_NOUN" text2= "it_PRON is_AUX a_DET beautiful_ADJ and_CCONJ big_ADJ apple_NOUN" I want to create a rule that extracts either an ADJ followed by a noun (NOUN), or an ADJ followed by (PUNCT or CCONJ) followed by an ADJ followed by a noun (NOUN). So the output I want is: text1 = [beautiful_ADJ apple_NOUN] text2= [beautiful_ADJ and_CCONJ big_ADJ apple_NOUN
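Because the text in this question is already a string of `word_TAG` tokens, the rule can be sketched with Python's standard `re` module. This is a plain-regex stand-in, not spaCy's rule-based matcher, and it assumes tokens are separated by single spaces as in the two sample strings.

```python
import re

text1 = "it_PRON is_AUX a_DET beautiful_ADJ apple_NOUN"
text2 = "it_PRON is_AUX a_DET beautiful_ADJ and_CCONJ big_ADJ apple_NOUN"

# ADJ, optionally followed by (PUNCT or CCONJ) + ADJ, then NOUN.
# The optional group is greedy, so the longer form is tried first.
rule = re.compile(
    r"(?<!\S)\S+_ADJ"                    # first adjective, at a token start
    r"(?: \S+_(?:PUNCT|CCONJ) \S+_ADJ)?" # optional connector + adjective
    r" \S+_NOUN(?!\S)"                   # noun, ending at a token boundary
)

result1 = rule.findall(text1)
result2 = rule.findall(text2)
print(result1)
print(result2)
```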

Regex Recursion: Nth Subpatterns

Question: I'm trying to learn about recursion in regular expressions, and have a basic understanding of the concepts in the PCRE flavour. I want to break the string Geese (Flock) Dogs (Pack) into: Full Match: Geese (Flock) Dogs (Pack) Group 1: Geese (Flock) Group 2: Geese Group 3: (Flock) Group 4: Dogs (Pack) Group 5: Dogs Group 6: (Pack) I know neither regex quite does this, but I was more curious as to why the first pattern works but the second one doesn't. Pattern 1: ((.*?)($\w{1,}$
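Python's built-in `re` module has no PCRE-style recursion, so the sketch below sidesteps recursion entirely and iterates over repeated matches instead. It is not either of the question's patterns (which are truncated above); the pattern here is a hypothetical one that assumes each item is a single word followed by a single parenthesised word.

```python
import re

text = "Geese (Flock) Dogs (Pack)"

# Group 1: whole item, group 2: the name, group 3: the parenthesised part.
item = re.compile(r"((\w+) (\(\w+\)))")

# Each tuple holds the three groups of one item; a single match with six
# numbered groups is replaced by two matches with three groups each.
items = item.findall(text)
for whole, name, group in items:
    print(whole, "|", name, "|", group)
```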

Regex to find last occurrence of pattern in a string

Question: My string is of the form: "as.asd.sd fdsfs. dfsd d.sdfsd. sdfsdf sd .COM" I only want to match the last segment of whitespace before the last period (.). So far I am able to capture whitespace, but not the very last occurrence, using: \s+(?=\.\w) How can I make it less greedy?

Answer 1: You can try this: (\s+)(?=\.[^.]+$) The (?=\.[^.]+$) part is a positive lookahead for a dot followed by non-dot characters up to the end of the line. Demo: https://regex101.com/r/k9VwC6/3

Answer 2: "as.asd.sd ffindMyLastOccurrencedsfs
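The pattern from Answer 1 can be checked with a short Python sketch. The second input, `two_candidates`, is a hypothetical string added only to show the difference between the question's pattern and the anchored one when more than one whitespace run is followed by a dot.

```python
import re

source = "as.asd.sd fdsfs. dfsd d.sdfsd. sdfsdf sd .COM"
two_candidates = "a .b c .COM"  # hypothetical input, not from the question

question_pattern = re.compile(r"\s+(?=\.\w)")
answer_pattern = re.compile(r"(\s+)(?=\.[^.]+$)")

# On the question's string, the anchored pattern finds the whitespace
# directly in front of the final ".COM".
match = answer_pattern.search(source)
tail = source[match.end():]

# On the hypothetical string the unanchored pattern matches twice, while the
# anchored lookahead only accepts the run before the last dot.
loose_hits = [m.start() for m in question_pattern.finditer(two_candidates)]
strict_hits = [m.start() for m in answer_pattern.finditer(two_candidates)]
print(tail, loose_hits, strict_hits)
```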

Greedy behaviour of grep

Question: I thought that in regular expressions, "greediness" applies to quantifiers rather than to matches as a whole. However, I observe that grep -E --color=auto 'a+(ab)?' <(printf "aab") returns aab rather than aa b. The same applies to sed. On the other hand, in pcregrep and other tools, it really is the quantifier that is greedy. Is this behaviour specific to grep? N.B. I checked both grep (BSD grep) 2.5.1-FreeBSD and grep (GNU grep) 3.1.

Answer 1: In the description of term matched, POSIX states
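The difference reported in this question can be illustrated in Python. The `re` module is a backtracking engine with greedy quantifiers, like the PCRE tools mentioned above; POSIX matching is usually described as leftmost-longest for the match as a whole. The helper `leftmost_longest` below is a hypothetical brute-force emulation written for this sketch, not how grep is implemented.

```python
import re

def leftmost_longest(pattern, text):
    """Return the longest match at the earliest start position, or None."""
    compiled = re.compile(pattern)
    for start in range(len(text) + 1):
        # Try the longest candidate span first, then shrink it.
        for end in range(len(text), start - 1, -1):
            if compiled.fullmatch(text, start, end):
                return text[start:end]
    return None

pattern = r"a+(ab)?"

# Backtracking semantics: `a+` takes "aa", `(ab)?` then matches nothing.
backtracking_result = re.search(pattern, "aab").group()

# Leftmost-longest semantics: the overall match is maximised instead.
posix_like_result = leftmost_longest(pattern, "aab")
print(backtracking_result, posix_like_result)
```

The emulation is quadratic in the input length and is meant only to make the two rules comparable on small strings.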

Multiline python regex

Question: I have a file structured like this:

A: some text
B: more text
even more text
on several lines
A: and we start again
B: more text
more
multiline text

I'm trying to find the regex that will split my file like this:

>>> re.findall(regex, f.read())
[('some text','more text','even more text\non several lines'), ('and we start again','more text', 'more\nmultiline text')]

So far, I've ended up with the following:

>>> re.findall('A:(.*?)\nB:(.*?)\n(.*?)',f.read(),re.DOTALL)
[(' some text', ' more
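In the attempt above, the final `(.*?)` is lazy and nothing follows it, so it can always succeed by matching the empty string. One possible fix, sketched here as an assumption rather than the thread's accepted answer, is to give the last group something to stop at: a lookahead for the next `A:` record or the end of the input. The `sample` string is a reconstruction of the file layout implied by the expected output.

```python
import re

sample = (
    "A: some text\n"
    "B: more text\n"
    "even more text\n"
    "on several lines\n"
    "A: and we start again\n"
    "B: more text\n"
    "more\n"
    "multiline text\n"
)

# The lookahead gives the last lazy group a boundary: either the next record
# ("\nA:") or trailing whitespace followed by the end of the string.
record = re.compile(r"A: ?(.*?)\nB: ?(.*?)\n(.*?)(?=\nA:|\s*\Z)", re.DOTALL)

records = record.findall(sample)
print(records)
```

The optional space after `A:` and `B:` is consumed outside the groups, which keeps the leading blank out of the captured text.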