问题
I have this file loaded in string:
// some preceding stuff
static char header_data[] = {
1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,
1,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,
1,1,0,1,0,1,0,1,1,0,1,0,1,0,1,1,
1,0,1,1,1,0,0,1,1,0,0,1,1,1,0,1,
0,0,0,1,1,1,1,1,1,1,1,1,1,0,1,1,
1,0,0,0,1,1,0,1,1,1,1,1,0,1,1,1,
0,1,0,0,0,1,0,0,1,1,1,1,0,0,0,0,
0,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,
0,1,1,1,0,0,0,0,0,0,1,1,0,1,1,0,
0,0,0,0,1,0,0,0,1,0,0,1,0,1,0,0,
1,1,1,0,1,1,0,0,1,1,0,0,0,1,1,1,
1,1,0,1,1,1,1,1,1,1,1,0,0,0,1,1,
1,0,1,1,1,0,0,1,1,0,0,0,0,0,1,1,
1,1,0,1,0,1,0,1,1,1,1,0,0,0,0,1,
1,1,1,0,1,1,0,1,1,0,1,1,1,1,0,1,
1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1
};
I want to get only the block with ones and zeros, and then somehow process it.
I imported re, and tried:
In [11]: re.search('static char header_data(.*);', src, flags=re.M)
In [12]: re.findall('static char header_data(.*);', src, flags=re.M)
Out[12]: []
Why doesn't it match anything? How to fix this? (It's python3)
回答1:
You need to use the re.S flag, not re.M.
re.M(re.MULTILINE) controls the behavior of^and$(whether they match at the start/end of the entire string or of each line).re.S(re.DOTALL) controls the behavior of the.and is the option you need when you want to allow the dot to match newlines.
See also the documentation.
回答2:
and then somehow process it.
Here we go to get a useable list out of the file:
import re
match = re.search(r"static char header_data\[\] = {(.*?)};", src, re.DOTALL)
if match:
header_data = "".join(match.group(1).split()).split(',')
print header_data
.*? is a non-greedy match so you really will get just the value between this set of braces.
A more expicit way without DOTALL or MULTILINE would be
match = re.search(r"static char header_data\[\] = {([01,\s\r\n]*?)};", src)
回答3:
If the format of the file does not change, you might as well not resort to re but use slices. Something on these lines could be useful
>>> file_in_string
'\n// some preceding stuff\nstatic char header_data[] = {\n 1,1,1,1,1,1,0,0,0
,0,1,1,1,1,1,1,\n 1,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,\n 1,1,0,1,0,1,0,1,1,0,1
,0,1,0,1,1,\n 1,0,1,1,1,0,0,1,1,0,0,1,1,1,0,1,\n 0,0,0,1,1,1,1,1,1,1,1,1,1
,0,1,1,\n 1,0,0,0,1,1,0,1,1,1,1,1,0,1,1,1,\n 0,1,0,0,0,1,0,0,1,1,1,1,0,0,0
,0,\n 0,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,\n 0,1,1,1,0,0,0,0,0,0,1,1,0,1,1,0,\
n 0,0,0,0,1,0,0,0,1,0,0,1,0,1,0,0,\n 1,1,1,0,1,1,0,0,1,1,0,0,0,1,1,1,\n
1,1,0,1,1,1,1,1,1,1,1,0,0,0,1,1,\n 1,0,1,1,1,0,0,1,1,0,0,0,0,0,1,1,\n 1,1
,0,1,0,1,0,1,1,1,1,0,0,0,0,1,\n 1,1,1,0,1,1,0,1,1,0,1,1,1,1,0,1,\n 1,1,1,1
,1,1,0,0,0,0,1,1,1,1,1,1\n };\n'
>>> lines = file_in_string.split()
>>> lines[9:-1]
['1,1,1,1,1,1,0,0,0,0,1,1,1,1,1,1,', '1,1,1,0,1,1,0,1,1,0,1,1,0,1,1,1,', '1,1,0,
1,0,1,0,1,1,0,1,0,1,0,1,1,', '1,0,1,1,1,0,0,1,1,0,0,1,1,1,0,1,', '0,0,0,1,1,1,1,
1,1,1,1,1,1,0,1,1,', '1,0,0,0,1,1,0,1,1,1,1,1,0,1,1,1,', '0,1,0,0,0,1,0,0,1,1,1,
1,0,0,0,0,', '0,1,1,0,0,0,0,0,0,1,1,1,1,1,1,0,', '0,1,1,1,0,0,0,0,0,0,1,1,0,1,1,
0,', '0,0,0,0,1,0,0,0,1,0,0,1,0,1,0,0,', '1,1,1,0,1,1,0,0,1,1,0,0,0,1,1,1,', '1,
1,0,1,1,1,1,1,1,1,1,0,0,0,1,1,', '1,0,1,1,1,0,0,1,1,0,0,0,0,0,1,1,', '1,1,0,1,0,
1,0,1,1,1,1,0,0,0,0,1,', '1,1,1,0,1,1,0,1,1,0,1,1,1,1,0,1,', '1,1,1,1,1,1,0,0,0,
0,1,1,1,1,1,1']
来源:https://stackoverflow.com/questions/27250171/python-re-search-not-working-on-multiline-string