regex-greedy

How to tell a RegEx to be greedy on an 'Or' Expression

北城以北 submitted on 2021-02-17 06:23:12
Question: Text:

    [A]I'm an example text [] But I want to be included [[]] [A]I'm another text without a second part []

Regex:

    \[A\][\s\S]*?(?:(?=\[\])|(?=\[\[\]\]))

Using the above regex, it's not possible to capture the second part of the first text. Is there a way to tell the regex to be greedy on the 'or' part? I want to capture the biggest group possible.

Edit 2: What I want to achieve: In our company, we're using a web service to report our working time. I want to ...
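The question is cut off, so the intended output is partly a guess. A minimal Python sketch, assuming each block starts at `[A]` and should run up to `[[]]` when one appears before the next `[A]`, and otherwise up to the first `[]`: the alternation tries the longer terminator first and uses a tempered token so the match cannot run into the next block. The variable names and the use of Python's `re` module are illustrative choices, not part of the original post.

```python
import re

text = ("[A]I'm an example text [] But I want to be included [[]] "
        "[A]I'm another text without a second part []")

# Pattern from the question: the lazy quantifier stops at the first
# terminator it meets, so "[]" always wins over "[[]]".
original = re.compile(r"\[A\][\s\S]*?(?:(?=\[\])|(?=\[\[\]\]))")

# Assumed fix: try the "[[]]" terminator first, refusing to cross into the
# next "[A]" block; fall back to "[]" only if that attempt fails.
prefer_long = re.compile(
    r"\[A\](?:(?!\[A\])[\s\S])*?(?=\[\[\]\])"  # up to "[[]]" within this block
    r"|\[A\][\s\S]*?(?=\[\])"                  # otherwise up to the first "[]"
)

original_matches = original.findall(text)
preferred_matches = prefer_long.findall(text)

for m in preferred_matches:
    print(repr(m))
```

The order of the alternatives matters: a backtracking engine takes the first alternative that succeeds, so "greedy on the or" is expressed by putting the preferred branch first rather than by changing a quantifier.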

Bash regex ungreedy match

和自甴很熟 submitted on 2021-02-04 22:16:15
Question: I have a regex pattern that is supposed to match at multiple places in a string. I want to get all the match groups into one array and then print every element. So, I've been trying this:

    #!/bin/bash
    f=$'\n\tShare1 Disk\n\tShare2 Disk\n\tPrnt1 Printer'
    regex=$'\n\t(.+?)\\s+Disk'
    if [[ $f =~ $regex ]]
    then
        for match in "${BASH_REMATCH[@]}"
        do
            echo "New match: $match"
        done
    else
        echo "No matches"
    fi

Result: New match: Share1 Disk Share2 Disk New match: Share1 Disk Share2

The expected result ...
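Two things work against the script above: bash's `=~` uses POSIX extended regular expressions, which have no lazy quantifiers, and `BASH_REMATCH` only holds the groups of a single match. As a point of comparison, here is a hedged Python sketch of the same extraction, assuming the truncated "expected result" is the list of share names. It is not a bash fix, only an illustration of what the pattern does in an engine that supports `.+?`.

```python
import re

f = "\n\tShare1 Disk\n\tShare2 Disk\n\tPrnt1 Printer"

# Same pattern as in the question; Python's re understands the lazy ".+?".
pattern = re.compile(r"\n\t(.+?)\s+Disk")

# findall returns the capture group of every non-overlapping match.
shares = pattern.findall(f)

for share in shares:
    print(f"New match: {share}")
```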

Regex with prefix and optional suffix

折月煮酒 submitted on 2021-02-04 06:19:47
Question: This may be the 101st question about optional regex suffixes on SO, but I didn't find any that could help me :( I need to extract part of a string from the common pattern prefix/s/o/m/e/t/h/i/n/g/suffix using a regular expression. The prefix is constant and the suffix may not appear at all, so prefix/(.+)/suffix doesn't meet my requirements. The pattern prefix/(.+)(?:/suffix)? returns s/o/m/e/t/h/i/n/g/suffix. The part (?:/suffix)? must somehow be more greedy. I want to get s/o/m/e/t/h/i ...
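A common way to handle this is to make the capture lazy and anchor the end of the pattern, so that the optional suffix is tried before the capture is allowed to grow over it. The sketch below assumes the desired result is `s/o/m/e/t/h/i/n/g` in both cases (the question is truncated) and that the whole string is the path; `extract` is a hypothetical helper name.

```python
import re
from typing import Optional

# Lazy capture plus an end anchor: at each step the engine first checks
# whether "(?:/suffix)?$" can finish the match before extending the capture.
pattern = re.compile(r"prefix/(.+?)(?:/suffix)?$")

def extract(path: str) -> Optional[str]:
    """Return the part between the prefix and the optional suffix."""
    m = pattern.match(path)
    return m.group(1) if m else None

with_suffix = extract("prefix/s/o/m/e/t/h/i/n/g/suffix")
without_suffix = extract("prefix/s/o/m/e/t/h/i/n/g")
print(with_suffix, without_suffix)
```

Without the `$` anchor the lazy group would stop after a single character, which is why the two changes go together.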

Is there a method of rule-based matching in spaCy to match patterns?

霸气de小男生 submitted on 2021-01-28 11:17:32
Question: I want to use rule-based matching. I have text in which each word carries its POS tag:

    text1 = "it_PRON is_AUX a_DET beautiful_ADJ apple_NOUN"
    text2 = "it_PRON is_AUX a_DET beautiful_ADJ and_CCONJ big_ADJ apple_NOUN"

I want to create a rule-based match that extracts either an ADJ followed by a NOUN, or an ADJ followed by PUNCT or CCONJ, followed by an ADJ, followed by a NOUN. So I want to have as output:

    text1 = [beautiful_ADJ apple_NOUN]
    text2 = [beautiful_ADJ and_CCONJ big_ADJ apple_NOUN
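Because the sample text already carries its tags as `word_POS` tokens, the rule can be sketched with Python's plain `re` module. This is a swapped-in technique for illustration, not spaCy's `Matcher`, and it assumes tokens are separated by single spaces exactly as in the samples.

```python
import re

text1 = "it_PRON is_AUX a_DET beautiful_ADJ apple_NOUN"
text2 = "it_PRON is_AUX a_DET beautiful_ADJ and_CCONJ big_ADJ apple_NOUN"

# ADJ NOUN, or ADJ (PUNCT|CCONJ) ADJ NOUN, over space-separated word_POS tokens.
pattern = re.compile(
    r"(?<!\S)\S+_ADJ"                      # an adjective token at a token boundary
    r"(?: \S+_(?:PUNCT|CCONJ) \S+_ADJ)?"   # optional connector plus second adjective
    r" \S+_NOUN(?!\S)"                     # the noun that closes the phrase
)

matches1 = pattern.findall(text1)
matches2 = pattern.findall(text2)
print(matches1)
print(matches2)
```

The optional group is greedy, so the longer ADJ-connector-ADJ-NOUN form is tried before the engine falls back to plain ADJ-NOUN.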

Regex Recursion: Nth Subpatterns

邮差的信 submitted on 2021-01-27 13:44:07
Question: I'm trying to learn about recursion in regular expressions, and have a basic understanding of the concepts in the PCRE flavour. I want to break the string:

    Geese (Flock) Dogs (Pack)

into:

    Full Match: Geese (Flock) Dogs (Pack)
    Group 1: Geese (Flock)
    Group 2: Geese
    Group 3: (Flock)
    Group 4: Dogs (Pack)
    Group 5: Dogs
    Group 6: (Pack)

I know neither regex quite does this, but I was more curious as to why the first pattern works but the second one doesn't.

Pattern 1: ((.*?)(\(\w{1,}\) ...
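Python's built-in `re` module has no recursion support, so the sketch below does not reproduce either PCRE pattern (both are truncated in the excerpt anyway). It only illustrates a related point, under the assumption that each item is a single word followed by a parenthesised word: a repeated or recursed capture group keeps just its last value, so per-item groups are usually obtained by iterating over matches rather than by one match with six groups.

```python
import re

s = "Geese (Flock) Dogs (Pack)"

# One item: the whole "Name (Group)" pair, the name, and the parenthesised part.
item = re.compile(r"((\w+) (\(\w+\)))")

# findall yields one tuple of groups per item instead of numbering them 1..6.
items = item.findall(s)

for whole, name, group in items:
    print(whole, "|", name, "|", group)
```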

Regex to find last occurrence of pattern in a string

和自甴很熟 submitted on 2020-12-05 12:12:02
Question: My string is of the form:

    "as.asd.sd fdsfs. dfsd d.sdfsd. sdfsdf sd .COM"

I only want to match the last segment of whitespace before the last period (.). So far I am able to capture whitespace, but not the very last occurrence, using:

    \s+(?=\.\w)

How can I make it less greedy?

Answer 1: You can try it like this:

    (\s+)(?=\.[^.]+$)

(?=\.[^.]+$) is a positive lookahead for a dot followed by non-dot characters up to the end of the line. Demo: https://regex101.com/r/k9VwC6/3

Answer 2: "as.asd.sd ffindMyLastOccurrencedsfs ...
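The pattern from Answer 1 can be checked with a short Python sketch. The second sample string `s2` is a made-up input added only to show that an earlier "whitespace before a dot" is skipped.

```python
import re

s = "as.asd.sd fdsfs. dfsd d.sdfsd. sdfsdf sd .COM"
s2 = "a .b c .d"  # hypothetical input with two candidate positions

# Whitespace that is followed by a dot and then only non-dot characters
# up to the end of the string, i.e. whitespace before the last period.
pattern = re.compile(r"(\s+)(?=\.[^.]+$)")

m = pattern.search(s)
m2 = pattern.search(s2)

# Show what follows each match to confirm it sits before the final dot.
print(repr(m.group(1)), s[m.end():])
print(repr(m2.group(1)), s2[m2.end():])
```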

Greedy behaviour of grep

回眸只為那壹抹淺笑 submitted on 2020-06-27 10:27:11
Question: I thought that in regular expressions, "greediness" applies to quantifiers rather than to matches as a whole. However, I observe that

    grep -E --color=auto 'a+(ab)?' <(printf "aab")

returns aab rather than aa (with the b left unmatched). The same applies to sed. On the other hand, in pcregrep and other tools, it really is the quantifier that is greedy. Is this behaviour specific to grep? N.B. I checked both grep (BSD grep) 2.5.1-FreeBSD and grep (GNU grep) 3.1.

Answer 1: In the description of the term "matched", POSIX states ...
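For contrast with grep's result, a backtracking engine such as Python's `re` behaves the way the asker describes for pcregrep: the greedy `a+` takes both `a` characters, the optional group then matches nothing, and the engine stops at the first overall success instead of searching for the longest one. A minimal sketch:

```python
import re

# Perl-style backtracking: greedy "a+" consumes "aa", then "(ab)?" cannot
# match at "b" and is skipped; the first successful path is reported.
m = re.match(r"a+(ab)?", "aab")

matched = m.group(0)
optional_group = m.group(1)

print(matched, optional_group)
```

POSIX tools instead report the leftmost-longest overall match, which is why grep and sed return `aab` for the same pattern.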

Multiline python regex

天涯浪子 submitted on 2020-05-14 19:09:57
Question: I have a file structured like this:

    A: some text
    B: more text
    even more text
    on several lines
    A: and we start again
    B: more text
    more
    multiline text

I'm trying to find the regex that will split my file like this:

    >>> re.findall(regex, f.read())
    [('some text','more text','even more text\non several lines'), ('and we start again','more text', 'more\nmultiline text')]

So far, I've ended up with the following:

    >>> re.findall('A:(.*?)\nB:(.*?)\n(.*?)', f.read(), re.DOTALL)
    [(' some text', ' more ...
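The last lazy group in the attempt has nothing after it, so it is satisfied with an empty match. One possible fix, sketched here under the assumption that the file is laid out as the expected output implies (the `data` string below is reconstructed from it, not copied from the original post), is to give that group a lookahead to stop at: the next `A:` record or the end of the input.

```python
import re

# Assumed file contents, reconstructed from the expected output.
data = (
    "A: some text\n"
    "B: more text\n"
    "even more text\n"
    "on several lines\n"
    "A: and we start again\n"
    "B: more text\n"
    "more\n"
    "multiline text\n"
)

# The lookahead gives the final lazy group a boundary: the next "A: " line
# or the end of the input (ignoring trailing whitespace).
regex = re.compile(r"A: (.*?)\nB: (.*?)\n(.*?)(?=\nA: |\s*\Z)", re.DOTALL)

records = regex.findall(data)
for record in records:
    print(record)
```

Matching the space after `A:` and `B:` inside the pattern also removes the leading blanks that show up in the asker's partial output.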