I have a file that includes a bunch of strings like "size=XXX;". I am trying Python's re module for the first time and am a bit mystified by the following behavior: if I
When a regular expression contains parentheses, they capture their contents into groups, and findall() then returns only those groups instead of the full match. Here's the relevant section from the docs:
(...) Matches whatever regular expression is inside the parentheses, and indicates the start and end of a group; the contents of a group can be retrieved after a match has been performed, and can be matched later in the string with the \number special sequence, described below. To match the literals '(' or ')', use \( or \), or enclose them inside a character class: [(] [)].
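A minimal sketch of how the number of capturing groups changes what findall() returns. The sample string below is made up for illustration and merely stands in for the file contents:

```python
import re

# Hypothetical sample text standing in for the contents of the file.
sample = "size=50; size=51; size=52; size=50;"

# No capturing group: findall() returns each full match.
no_group = re.findall(r'size=50;', sample)

# One capturing group: findall() returns only what the group captured.
one_group = re.findall(r'size=(50|51);', sample)

# Two capturing groups: findall() returns a list of tuples, one item per group.
two_groups = re.findall(r'(size)=(50|51);', sample)

print(no_group)    # ['size=50;', 'size=50;']
print(one_group)   # ['50', '51', '50']
print(two_groups)  # [('size', '50'), ('size', '51'), ('size', '50')]
```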
To avoid this behaviour, you can use a non-capturing group:
>>> print(re.findall(r'size=(?:50|51);', myfile))
['size=51;', 'size=51;', 'size=51;', 'size=50;', 'size=50;', 'size=50;', 'size=50;']
Again, from the docs:
(?:...) A non-capturing version of regular parentheses. Matches whatever regular expression is inside the parentheses, but the substring matched by the group cannot be retrieved after performing a match or referenced later in the pattern.
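If the capturing group has to stay in the pattern for some other reason, one alternative is to iterate over match objects with re.finditer() and ask each one for group(0), which is always the full match. The following is only a sketch, using a made-up sample string in place of the file contents:

```python
import re

# Hypothetical sample text standing in for the contents of the file.
sample = "size=50; size=51; size=52; size=50;"

# Non-capturing group: findall() returns the full matches.
non_capturing = re.findall(r'size=(?:50|51);', sample)

# Capturing group kept, but group(0) of each match object is the whole match.
via_finditer = [m.group(0) for m in re.finditer(r'size=(50|51);', sample)]

print(non_capturing)  # ['size=50;', 'size=51;', 'size=50;']
print(via_finditer)   # ['size=50;', 'size=51;', 'size=50;']
```

Both approaches give the same list here; the non-capturing group is the shorter option when the captured value is not needed anywhere else.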