Question
I have this regex pattern which I made myself (I'm a noob though, and made it through following tutorials):
^([a-z0-9\p{Greek}].*)\s(Ε[0-9\p{Greek}]+|Θ)\s[\(]([a-z1-9\p{Greek}]+.*)[\)]\s-\s([a-z0-9\p{Greek}]+$)
And I'm trying to match the following sentences:
ΠΡΟΓΡΑΜΜΑΤΙΣΤΙΚΕΣ ΕΦΑΡΜ ΣΤΟ ΔΙΑΔΙΚΤΥΟ Ε2 (Ε.Β.Δ.) - ΔΗΜΗΤΡΙΟΥ
ΠΡΟΓΡΑΜΜΑΤΙΣΜΟΣ 1 Θ (ΑΜΦ) - ΜΑΣΤΟΡΟΚΩΣΤΑΣ
ΕΙΣΑΓΩΓΗ ΣΤΗΝ ΠΛΗΡΟΦΟΡΙΚΗ Θ (ΑΜΦ) - ΒΟΛΟΓΙΑΝΝΙΔΗΣ
And so on.
This pattern splits the string into 4 parts.
For example, for the string:
ΠΡΟΓΡΑΜΜΑΤΙΣΤΙΚΕΣ ΕΦΑΡΜ ΣΤΟ ΔΙΑΔΙΚΤΥΟ Ε2 (Ε.Β.Δ.) - ΔΗΜΗΤΡΙΟΥ
The first match is: ΠΡΟΓΡΑΜΜΑΤΙΣΤΙΚΕΣ ΕΦΑΡΜ ΣΤΟ ΔΙΑΔΙΚΤΥΟ (Subject's Name)
Second match is: Ε2 (Class)
Third match is: Ε.Β.Δ. (Room)
And the fourth match is: ΔΗΜΗΤΡΙΟΥ (Teacher)
Now, in some entries Ε*/Θ is not defined, and I want to get the 3 matches without the Ε*/Θ. How should I modify my pattern so that (Ε[0-9\p{Greek}]+|Θ) is an optional match?
I tried ? so far, but because my pattern has a \s both before and after that group, it requires 2 whitespace characters to get 3 matches, and I only have one at that point in my string.
Answer 1:
I think you need to do two things:
- Make .* lazy (i.e. .*?)
- Enclose \s(Ε[0-9\p{Greek}]+|Θ) in a non-capturing optional group: (?:\s(Ε[0-9\p{Greek}]+|Θ))?
The regex will look like
^([a-z0-9\p{Greek}].*?)(?:\s(Ε[0-9\p{Greek}]+|Θ))?\s[\(]([a-z1-9\p{Greek}]+.*)[\)]\s-\s([a-z0-9\p{Greek}]+)$
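If you do not make the first .* lazy, it will greedily consume the text that the optional second group could have matched, and because that group is optional, the regex still succeeds with the group left empty. Making .* lazy ensures that if there is some text that the second capturing group can match, that group will be "set".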
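To see the difference, here is a minimal Python sketch that runs the greedy and the lazy variant side by side. It rests on a few assumptions that are not in the original post: Python's built-in re module does not support \p{Greek}, so the Greek and Coptic block (U+0370 to U+03FF) stands in for it, and the helper name build_pattern and the entry without a class are made up for illustration.

```python
import re

# Assumption: Python's built-in re has no \p{Greek}, so the Greek and Coptic
# block U+0370..U+03FF is used as a stand-in character range.
GREEK = r"\u0370-\u03FF"


def build_pattern(first_quantifier):
    """Hypothetical helper: build the regex with a greedy or lazy first group."""
    return re.compile(
        rf"^([a-z0-9{GREEK}]{first_quantifier})"  # 1: subject name
        rf"(?:\s(Ε[0-9{GREEK}]+|Θ))?"             # 2: class, optional together with its space
        rf"\s\(([a-z1-9{GREEK}]+.*)\)"            # 3: room, inside parentheses
        rf"\s-\s([a-z0-9{GREEK}]+)$"              # 4: teacher
    )


greedy = build_pattern(".*")
lazy = build_pattern(".*?")

with_class = "ΠΡΟΓΡΑΜΜΑΤΙΣΜΟΣ 1 Θ (ΑΜΦ) - ΜΑΣΤΟΡΟΚΩΣΤΑΣ"
# Hypothetical entry with no class, as described in the question.
without_class = "ΕΙΣΑΓΩΓΗ ΣΤΗΝ ΠΛΗΡΟΦΟΡΙΚΗ (ΑΜΦ) - ΒΟΛΟΓΙΑΝΝΙΔΗΣ"

for label, pattern in (("greedy", greedy), ("lazy", lazy)):
    for line in (with_class, without_class):
        print(label, pattern.match(line).groups())
```

With the greedy version, the Θ ends up inside the first group and the second group is None. With the lazy version, the second group holds Θ when it is present and is None only for the entry that has no class.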
Note that you call capture groups matches, which is not accurate. A match is the whole text matched by the entire regular expression, while captures are just the substrings matched by the parts of the regex enclosed in unescaped parentheses. See more on capture groups at regular-expressions.info.
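The distinction between a match and its captures can be illustrated with a short Python sketch. As an assumption, the range U+0370 to U+03FF again replaces \p{Greek}, which Python's built-in re module does not support; the variable names are only illustrative.

```python
import re

# Assumption: U+0370..U+03FF (Greek and Coptic block) stands in for \p{Greek}.
GREEK = r"\u0370-\u03FF"

pattern = re.compile(
    rf"^([a-z0-9{GREEK}].*?)(?:\s(Ε[0-9{GREEK}]+|Θ))?"
    rf"\s\(([a-z1-9{GREEK}]+.*)\)\s-\s([a-z0-9{GREEK}]+)$"
)

line = "ΕΙΣΑΓΩΓΗ ΣΤΗΝ ΠΛΗΡΟΦΟΡΙΚΗ Θ (ΑΜΦ) - ΒΟΛΟΓΙΑΝΝΙΔΗΣ"
m = pattern.match(line)

whole_match = m.group(0)  # the match: all text consumed by the whole pattern
captures = m.groups()     # the captures: one substring per capturing (...) group

print(whole_match)
print(captures)
```

Here there is one match (the entire line) and four captures. The (?:...) wrapper is non-capturing, so it does not add a fifth entry.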
Answer 2:
You can use something like:
(Ε[0-9\p{Greek}]+|Θ)?
The whole group will be optional (?).
Source: https://stackoverflow.com/questions/33486995/regex-optional-match