I would like to create a regular expression that matches A, B, and AB, where A and B are quite complex regular expressions.
In general, it is not possible. You may use some workarounds though.
A and B start and end with word characters

In case A and B are, or start and end with, word characters (letters, digits or _), you may use
(?<!\w)A?(?:B)?(?!\w)(?<!\W(?!\w))(?<!^(?!\w))
See the regex demo
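As a quick check of the pattern above, here is a minimal Python sketch. The sub-patterns foo and bar, the helper name optional_pair and the sample text are illustrative assumptions, not part of the original question. A and B are each wrapped in a non-capturing group so that the ? quantifier applies to the whole sub-pattern rather than to its last atom.

```python
import re

def optional_pair(a: str, b: str) -> re.Pattern:
    """Build the lookaround-guarded pattern for A, B, or AB (hypothetical helper)."""
    return re.compile(
        rf"(?<!\w)(?:{a})?(?:{b})?(?!\w)"  # optional A, optional B, between non-word contexts
        r"(?<!\W(?!\w))"                   # reject empty matches after a non-word char
        r"(?<!^(?!\w))"                    # reject an empty match at the start of the string
    )

# "foo" and "bar" stand in for the complex sub-patterns A and B.
pattern = optional_pair("foo", "bar")
matches = [m.group() for m in pattern.finditer("foo bar foobar, baz")]
print(matches)  # ['foo', 'bar', 'foobar']
```

Without the last two lookbehinds, the same pattern would also report empty matches, for example between the comma and the space in the sample text.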
Details of the pattern:
(?<!\w) - no word character is allowed right before
A? - an optional A
(?:B)? - an optional B
(?!\w) - no word character is allowed right after (at this point, we may match empty strings between the start of the string and a non-word char, between a non-word char and the end of the string, or between two non-word chars, hence we add the next two lookbehinds)
(?<!\W(?!\w)) - no match is allowed if right before there is a non-word char that is not followed by a word char (this cancels empty matches between two non-word chars, and between a non-word char and the end of the string)
(?<!^(?!\w)) - no match is allowed at the start of the string if it is not followed by a word char.

In PCRE, you may avoid repeating the same pattern part since you may recurse subpatterns with subroutine calls:
A(?<BGroup>B)?|(?&BGroup)
See the regex demo.
The (?<BGroup>B) is a named capturing group whose pattern is repeated with the (?&BGroup) named subroutine call.
See Recursive patterns.
I would go for storing A and B in variables and creating the pattern (AB?|B) from A and B by concatenation. This has the advantage of enhancing readability, as you can document the subpatterns A and B separately.
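A minimal Python sketch of this approach follows. The two sub-patterns (two uppercase letters for A, three digits for B) and the helper name a_b_or_ab are assumptions chosen only for illustration. Each sub-pattern is wrapped in a non-capturing group so that any alternation or quantifier inside A or B cannot leak into the combined pattern.

```python
import re

# Hypothetical sub-patterns standing in for the complex A and B;
# each one can be documented where it is defined.
A = r"[A-Z]{2}"  # A: two uppercase letters
B = r"\d{3}"     # B: three digits

def a_b_or_ab(a: str, b: str) -> str:
    """Return a pattern that matches a, b, or a immediately followed by b."""
    return rf"(?:{a})(?:{b})?|(?:{b})"

PATTERN = re.compile(a_b_or_ab(A, B))

print(PATTERN.findall("AB 123 AB123"))  # ['AB', '123', 'AB123']
```

The regex engine still sees B twice, but the source code states it only once, which is usually what matters for maintenance.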