Question
My regex-fu is weak today. I'm trying to split a string into 5 captured groups with the format:
substring delimiter substring number(space) substring
I've tried using word boundaries, but with no success. I've resorted to using .* (greedy and lazy, I know), which is a bit better than not working at all.
Here's what I have:
import re
s = "FOREVER - Alabaster Cuttlefish - 01 This Style Is Cheese"
m = re.compile(r"(.*)(\s-\s)(\d{1,3}\s)(.*)")
g = m.match(s)
if g:
    print(g.group(1))  # wanted: FOREVER
    print(g.group(2))  # wanted: -
    print(g.group(3))  # wanted: Alabaster Cuttlefish
    print(g.group(4))  # wanted: 01
    # fail
    # print(g.group(5))  # wanted: This Style Is Cheese
Group 5 doesn't exist because its text gets captured in the first group, hence my quandary.
Answer 1:
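You are very close. Make the first group lazy so it stops at the first delimiter, and add a group for the text before the number. Replace the regular expression with:
m = re.compile(r"(.*?)(\s-\s)([^\d]*)(\d{1,3}\s)(.*)")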
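As a quick check of what that pattern captures on the sample string, here is a minimal sketch; the names `pattern` and `groups` are only illustrative and not part of the original answer:

```python
import re

s = "FOREVER - Alabaster Cuttlefish - 01 This Style Is Cheese"

# The lazy first group stops at the first " - ".
# [^\d]* then consumes everything up to the first digit,
# which includes the second " - " delimiter.
pattern = re.compile(r"(.*?)(\s-\s)([^\d]*)(\d{1,3}\s)(.*)")
groups = pattern.match(s).groups()

print(groups)
# ('FOREVER', ' - ', 'Alabaster Cuttlefish - ', '01 ', 'This Style Is Cheese')
```

Note that group 3 ends with " - " and group 4 keeps the space after the number.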
If you don't want the trailing dash at the end of Alabaster Cuttlefish, capture the second delimiter in its own group (the number and the title then move to groups 5 and 6):
import re
s = "FOREVER - Alabaster Cuttlefish - 01 This Style Is Cheese"
m = re.compile(r"(.*)(\s-\s)(.*)(\s-\s)(\d{1,3}\s)(.*)")
g = m.search(s)
if g:
    print(g.group(1))  # FOREVER
    print(g.group(2))  # -
    print(g.group(3))  # Alabaster Cuttlefish
    print(g.group(5))  # 01
    print(g.group(6))  # This Style Is Cheese
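If exactly five groups are wanted, as in the question, one possible variation (an assumption on my part, not something the answer above proposes) is to match the second delimiter with a non-capturing group `(?:...)` so it does not take up a group number. The names `pattern` and `parts` are illustrative:

```python
import re

s = "FOREVER - Alabaster Cuttlefish - 01 This Style Is Cheese"

# (?:\s-\s) matches the second " - " without creating a group,
# so the groups stay numbered 1 to 5.
pattern = re.compile(r"(.*?)(\s-\s)(.*?)(?:\s-\s)(\d{1,3}\s)(.*)")
m = pattern.match(s)
parts = m.groups() if m else ()

print(parts)
# ('FOREVER', ' - ', 'Alabaster Cuttlefish', '01 ', 'This Style Is Cheese')
```

The number group still includes its trailing space; call `.strip()` on it if that is not wanted.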
Source: https://stackoverflow.com/questions/38933353/regex-split-into-groups