Question
I am trying to parse a string which is of the following format:
text="some random string <inAngle> <anotherInAngle> [-option text] [-anotherOption <text>] [-option (Y|N)]"
I want to split the string into three parts:
- Just the "some random string"
- Everything that is ONLY in angle brackets, i.e. inAngle and anotherInAngle above.
- Everything that is in square brackets.
If I use the regex
re.findall(r'\[(.+?)\]', text)
it gives me everything I need within square brackets. If I use the same regex with angle brackets, however,
re.findall(r'<(.+?)>', text)
it also returns text in angle brackets that sit inside square brackets, for example "text" from [-anotherOption <text>] above. I do not want that. The regex for angle brackets should return only "inAngle" and "anotherInAngle" from above. What would that regex be?
Also, how do I get only the first part, i.e. "some random string"? This part can be 2 or 3 words long.
Answer 1:
You can simply disregard everything between square brackets before searching for things in angle brackets:
interm = re.sub(r'\[(.*?)\]', '', text)
re.findall(r'<(.+?)>', interm)
outputs
['inAngle', 'anotherInAngle']
Then, to match the first part, match everything up to the first [ or <. Granted, this won't work if the first part is allowed to contain a stray, unclosed [ or <:
re.findall(r'([^<\[]+)', text)[0].strip()
outputs
some random string
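The steps above can be combined into one helper. This is only a sketch: the function name split_parts and the order of the returned values are illustrative assumptions, not part of the original answer, and it inherits the same limitation regarding stray brackets in the first part.

```python
import re

def split_parts(text):
    """Return (head, angle, square) for a string in the question's format.

    split_parts is a hypothetical helper name used for illustration.
    """
    # Everything inside square brackets, as in the question.
    square = re.findall(r'\[(.+?)\]', text)
    # Drop the square-bracketed groups so nested <...> are not picked up.
    without_square = re.sub(r'\[.*?\]', '', text)
    angle = re.findall(r'<(.+?)>', without_square)
    # Leading text up to the first '<' or '['; strip the trailing space.
    m = re.match(r'[^<\[]+', text)
    head = m.group(0).strip() if m else ''
    return head, angle, square

text = "some random string <inAngle> <anotherInAngle> [-option text] [-anotherOption <text>] [-option (Y|N)]"
print(split_parts(text))
```

Using re.match instead of re.findall(...)[0] avoids an IndexError when the string starts directly with a bracket.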
Answer 2:
Try whether this regex captures what you need:
\s*([^><[\]]+\b)|\[([^]]*)]|<([^>]*)>
\s* : preceded by optional whitespace
([^><[\]]+\b) : Group 1: any non-bracket characters up to a word boundary \b (remove the \b if undesired)
|\[([^]]*)] : or Group 2: what's inside square brackets
|<([^>]*)> : or Group 3: what's inside angle brackets
See demo at regex101 (use "code generator" if needed)
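In Python, one way to use this pattern is to walk the matches with re.finditer and sort each one by the group that participated. The sketch below is an assumption about usage rather than part of the answer: the name collect_groups is hypothetical, and the brackets inside the character classes are written with explicit backslashes, which should be equivalent to the pattern above.

```python
import re

# Same alternation as above, with the brackets escaped explicitly.
PATTERN = re.compile(r'\s*([^><\[\]]+\b)|\[([^\]]*)\]|<([^>]*)>')

def collect_groups(text):
    """Sort matches into plain text, square-bracket and angle-bracket lists."""
    plain, square, angle = [], [], []
    for m in PATTERN.finditer(text):
        # Exactly one group participates per match; the others are None.
        if m.group(1) is not None:
            plain.append(m.group(1))
        elif m.group(2) is not None:
            square.append(m.group(2))
        else:
            angle.append(m.group(3))
    return plain, square, angle

text = "some random string <inAngle> <anotherInAngle> [-option text] [-anotherOption <text>] [-option (Y|N)]"
print(collect_groups(text))
```

Because the square-bracket alternative consumes the whole [...] span in one match, the <text> nested inside [-anotherOption <text>] is never offered to the angle-bracket alternative.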
Answer 3:
<(.+?)>(?![^\[]*\])|\[(.+?)\]|((?!\s+)[^\[\]<>]+)
You can simply use this with re.findall. See demo.
https://regex101.com/r/hE4jH0/10
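With three capture groups, re.findall returns one 3-tuple per match, with empty strings for the groups that did not participate. A minimal sketch of unpacking that result follows; the name split_with_findall is hypothetical, and the strip() call is an addition here because the third group keeps the space before the first bracket.

```python
import re

PATTERN = r'<(.+?)>(?![^\[]*\])|\[(.+?)\]|((?!\s+)[^\[\]<>]+)'

def split_with_findall(text):
    """Unpack the (angle, square, plain) tuples produced by re.findall."""
    matches = re.findall(PATTERN, text)
    angle = [a for a, s, p in matches if a]
    square = [s for a, s, p in matches if s]
    # The plain-text group can carry trailing whitespace, so strip it.
    plain = [p.strip() for a, s, p in matches if p]
    return plain, square, angle

text = "some random string <inAngle> <anotherInAngle> [-option text] [-anotherOption <text>] [-option (Y|N)]"
print(split_with_findall(text))
```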
Source: https://stackoverflow.com/questions/33747372/python-regex-for-exact-matches-of-brackets