How can I get the text between the quotes in the following two strings?
>>> import re
>>> text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
+ "Another text on \"three\" lines" """
>>> text_2 = r""" "Some text on \"two\" lines with a backslash escaped\\" + "Another text on \"three\" lines" """
>>> re.findall(r'\\"([^"]+)\\"', text_2)
['two', 'three']
>>> re.findall(r'\\"([^"]+)\\"', text_1)
['two', 'three']
Perhaps you want this:
re.findall(r'\\"((?:(?<!\\)[^"])+)\\"', text)
>>> import re
>>> text = "Some text on\n\"two\"lines" + "Another texton\n\"three\"\nlines"
>>> re.findall(r'"(.*)"', text)
['two', 'three']
Match everything but a double quote:
import re
text = "Some text on \"two\" lines" + "Another text on \"three\" lines"
print(re.findall(r'"([^"]*)"', text))
Output
['two', 'three']
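To see why the negated character class helps, here is a small sketch comparing it with a greedy `.*` when two quoted parts sit on the same line. The `sample` string is a made-up input for illustration, not one of the texts from the question.

```python
import re

# Hypothetical input: two quoted parts on a single line.
sample = 'say "one" and "two"'

# Greedy .* runs to the last quote on the line, so both parts are merged.
greedy = re.findall(r'"(.*)"', sample)

# [^"]* cannot cross a quote, so each quoted part is matched separately.
negated = re.findall(r'"([^"]*)"', sample)

print(greedy)   # ['one" and "two']
print(negated)  # ['one', 'two']
```

Note that `[^"]*` stops at every double quote, so it will also stop at a quote that is escaped with a backslash.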
"(?:\\.|[^"\\])*"
matches a quoted string, including any escaped characters that occur within it.
Explanation:
"        # Match a quote.
(?:      # Either match...
 \\.     # an escaped character
|        # or
 [^"\\]  # any character except quote or backslash.
)*       # Repeat any number of times.
"        # Match another quote.
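As a rough sketch, the pattern can be applied to `text_1` from the question like this. The capture group around the contents and the name `QUOTED` are my additions (assumptions, not part of the pattern as given), so that `findall` returns the text without the surrounding quotes.

```python
import re

# The pattern explained above, written with re.VERBOSE.
# The outer capture group is an addition so that findall
# returns only what is between the quotes.
QUOTED = re.compile(r'''
    "               # opening quote
    (               # capture the contents
      (?:
          \\.       # an escaped character (backslash plus one character)
        | [^"\\]    # or any character except quote or backslash
      )*
    )
    "               # closing quote
''', re.VERBOSE)

text_1 = r""" "Some text on \"two\" lines with a backslash escaped\\" \
+ "Another text on \"three\" lines" """

matches = QUOTED.findall(text_1)
for m in matches:
    print(m)
```

On this input the two quoted strings come back whole, with the escaped quotes and the escaped backslash still in their raw form. Unescaping them would be a separate step.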