Exclusive Or in Regular Expression

Posted by ◇◆丶佛笑我妖孽 on 2019-11-26 20:21:29

Question


I'm looking for a bit of regex help. I'd like to design an expression that matches a string with "foo" OR "bar", but not both "foo" AND "bar".

If I do something like...

/((foo)|(bar))/

It'll match "foobar". Not what I'm looking for. So, how can I make regex match only when one term or the other is present?

Thanks!


Answer 1:


You can do this with a single regex but I suggest for the sake of readability you do something like...

(/foo/ and not /bar/) || (/bar/ and not /foo/)
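
A minimal Python sketch of both options, assuming "match" means the term appears anywhere in the string. The names `xor_two_tests`, `xor_single_regex` and `XOR_PATTERN` are illustrative and not from the question, and the lookahead pattern is only one possible way to write the single-regex form.

```python
import re

# Single-regex form (an assumption about what "a single regex" could look like):
# either "foo" appears and "bar" does not, or the reverse.
XOR_PATTERN = re.compile(r"(?s)^(?:(?=.*foo)(?!.*bar)|(?=.*bar)(?!.*foo))")


def xor_two_tests(text: str) -> bool:
    """Readable form: run two searches and XOR the results."""
    has_foo = re.search(r"foo", text) is not None
    has_bar = re.search(r"bar", text) is not None
    return has_foo != has_bar


def xor_single_regex(text: str) -> bool:
    """Same test expressed as one pattern with lookaheads."""
    return XOR_PATTERN.match(text) is not None


for sample in ["foo", "bar", "foobar", "foo and bar", "baz"]:
    print(sample, xor_two_tests(sample), xor_single_regex(sample))
```

Both forms only test which terms are present, so a string such as "foofoo" still passes because "bar" never appears in it.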



Answer 2:


This is what I use:

/^(foo|bar){1}$/

See: http://www.regular-expressions.info/quickstart.html under repetition




Answer 3:


If your regex language supports it, use negative lookaround:

(?<!foo|bar)(foo|bar)(?!foo|bar)

This will match "foo" or "bar" that is not immediately preceded or followed by "foo" or "bar", which I think is what you wanted.

It's not clear from your question or examples if the string you're trying to match can contain other tokens: "foocuzbar". If so, this pattern won't work.

Here are the results of your test cases ("true" means the pattern was found in the input):

foo: true
bar: true
foofoo: false
barfoo: false
foobarfoo: false
barbar: false
barfoofoo: false
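
The results above can be reproduced with a short Python sketch. It assumes Python's `re` module, which accepts this lookbehind because both alternatives have the same length; the `cases` list adds "foocuzbar" to illustrate the caveat about other tokens.

```python
import re

# Pattern from the answer; both lookbehind alternatives are three
# characters long, which Python's fixed-width lookbehind rule allows.
PATTERN = re.compile(r"(?<!foo|bar)(foo|bar)(?!foo|bar)")

cases = ["foo", "bar", "foofoo", "barfoo", "foobarfoo",
         "barbar", "barfoofoo", "foocuzbar"]

# True means the pattern was found somewhere in the input.
results = {case: PATTERN.search(case) is not None for case in cases}

for case, found in results.items():
    print(f"{case}: {found}")
```

"foocuzbar" is reported as found, because "cuz" separates the two terms and neither lookaround sees the other one.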



Answer 4:


This will take 'foo' and 'bar' but not 'foobar' and not 'blafoo' and not 'blabar':

/^(foo|bar)$/

^ = mark start of string (or line)
$ = mark end of string (or line)

This will take 'foo' and 'bar' and 'foo bar' and 'bar-foo' but not 'foobar' and not 'blafoo' and not 'blabar':

/\b(foo|bar)\b/

\b = mark word boundary
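
A small Python sketch comparing the two patterns on the strings mentioned above. The names `WHOLE_STRING` and `WHOLE_WORD` are illustrative only.

```python
import re

WHOLE_STRING = re.compile(r"^(foo|bar)$")   # the entire string must be one term
WHOLE_WORD = re.compile(r"\b(foo|bar)\b")   # a term must stand alone as a word

samples = ["foo", "bar", "foo bar", "bar-foo", "foobar", "blafoo", "blabar"]
for s in samples:
    whole_string = WHOLE_STRING.search(s) is not None
    whole_word = WHOLE_WORD.search(s) is not None
    print(f"{s!r}: whole_string={whole_string} whole_word={whole_word}")
```

Note that the word-boundary pattern accepts 'foo bar' even though both terms are present, so it does not by itself enforce "one but not the other".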



Answer 5:


You haven't specified behaviour regarding content other than "foo" and "bar" or repetitions of one in the absence of the other. e.g., Should "food" or "barbarian" match?

Assume that you want to match strings which contain exactly one instance of either "foo" or "bar", but not both and not multiple instances of the same one, without regard for anything else in the string (i.e., "food" matches and "barbarian" does not match). Then you could use a regex which returns the number of matches found and only consider it successful if exactly one match is found, e.g., in Perl:

my @matches = ($value =~ /(foo|bar)/g);  # @matches now holds all foos or bars present
if (scalar @matches == 1) {          # exactly one match found
  ...
}

If multiple repetitions of that same target are allowed (i.e., "barbarian" matches), then this same general approach could be used by then walking the list of matches to see whether the matches are all repeats of the same text or if the other option is also present.
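
Both variants can be sketched in Python as follows. This is an illustration rather than a translation of the Perl above, and the function names are hypothetical.

```python
import re

TERMS = re.compile(r"foo|bar")


def exactly_one_occurrence(value: str) -> bool:
    """True if "foo" or "bar" occurs exactly once in total."""
    return len(TERMS.findall(value)) == 1


def only_one_kind(value: str) -> bool:
    """True if one term occurs (any number of times) and the other never does."""
    return len(set(TERMS.findall(value))) == 1


for value in ["food", "barbarian", "foobar", "nothing here"]:
    print(value, exactly_one_occurrence(value), only_one_kind(value))
```

Collecting the matches into a set is one way to "walk the list": repeats of the same text collapse to a single element, so a set of size one means the other option never appeared.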




Answer 6:


You might want to consider the ? conditional test.

(?(?=regex)then|else)

Regular Expression Conditionals




Answer 7:


If you want a true exclusive or, I'd just do that in code instead of in the regex. In Perl:

/foo/ xor /bar/

But your comment:

Matches: "foo", "bar" nonmatches: "foofoo" "barfoo" "foobarfoo" "barbar" "barfoofoo"

indicates that you're not really looking for exclusive or. You actually mean "Does /foo|bar/ match exactly once?"

my $matches = 0;
while (/foo|bar/g) {
  last if ++$matches > 1;
}

my $ok = ($matches == 1);



Answer 8:


I know this is a late entry, but just to help others who may be looking:

(\b(?:(?:(?!foo)bar)|(?:(?!bar)foo))\b)



Answer 9:


I'd use something like this. It only checks for whitespace around the words, but you could use \b or \B to check for a word boundary instead. This would match " foo " or " bar " including the surrounding spaces, so if you're replacing anything you'd have to put the whitespace back as well.

/\s((foo)|(bar))\s/



Answer 10:


I don't think this can be done with a single regular expression. And boundaries may or may not work depending on what you're matching against.

I would match against each regex separately, and do an XOR on the results.

import re

# text is the string being tested
foo = re.search("foo", text) is not None
bar = re.search("bar", text) is not None
if foo ^ bar:
    pass  # do something...



Answer 11:


I tried with Regex Coach against:

x foo y
x bar y
x foobar y

If I check the g option, indeed it matches all three words, because it searches again after each match.
If you don't want this behavior, you can anchor the expression, for example matching only on word boundaries:

\b(foo|bar)\b

Giving more context on the problem (what the data looks like) might give better answers.




Answer 12:


\b(foo)\b|\b(bar)\b

And use only the first capture group.




Answer 13:


Using the word boundaries, you can get the single word...

me@home ~  
$ echo "Where is my bar of soap?" | egrep "\bfoo\b|\bbar\b"  
Where is my bar of soap?  

me@home ~  
$ echo "What the foo happened here?" | egrep "\bfoo\b|\bbar\b"  
What the foo happened here?  

me@home ~  
$ echo "Boy, that sure is foobar\!" | egrep "\bfoo\b|\bbar\b"  


Source: https://stackoverflow.com/questions/247167/exclusive-or-in-regular-expression
