Question
I have an S19 file that looks something like this:
S0030000FC
S30D0003C0000F0000000000000020
S3FD00000000782EFF1FB58E00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S3ED000000F83D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S31500000400FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF7D
S3FD0000041010B5DFF828000468012147F22C10C4F20300016047F22010C4F2030000
S70500008EB4B8
I want to split each line into fields: the first two characters, the next two characters, and so on, with the last two characters of each line also separated. The result should look like this:
S0, 03, 0000, FC
S3, 0D, 0003C000, 0F00000000000000, 20
S3, FD, 00000000, 782EFF1FB58E00003D2B00003D2B00003D2B00003D2B00003D2B0000, 3D
S3, ED, 000000F8, 3D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D2B0000, 3D
S3, 15, 00000400, FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF, 7D
S3, FD, 00000410, 10B5DFF828000468012147F22C10C4F20300016047F22010C4F20300, 00
S7, 05, 00008EB4, B8
How can I do this in Python? I have something like this:
#!/usr/bin/python
import string,os,sys,re,fileinput
print "hi"
inputfile = "k60.S19"
outputfile = "k60_out.S19"
# open the source file and read it
fh = file(inputfile, 'r')
subject = fh.read()
fh.close()
# create the pattern object. Note the "r". In case you're unfamiliar with Python
# this is to set the string as raw so we don't have to escape our escape characters
pattern2 = re.compile(r'S3')
pattern3 = re.compile(r'S7')
pattern1 = re.compile(r'S0')
# do the replace
result1 = pattern1.sub("S0, ", subject)
result2 = pattern2.sub("S3, ", subject)
result3 = pattern3.sub("S7, ", subject)
# write the file
f_out = file(outputfile, 'w')
f_out.write(result1)
f_out.write(result2)
f_out.write(result3)
f_out.close()
#EoF
But this does not work the way I want. Can someone help me come up with the proper regular expression for this?
Answer 1:
Try the bincopy package; it may be what you need.
bincopy: Mangling of various file formats that conveys binary information (Motorola S-Record, Intel HEX and binary files).
import bincopy
f = bincopy.BinFile()
f.add_srec_file("path/to/your/s19/file.s19")
f.as_binary()  # returns the contents as binary data (it does not print them)
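Or you can simply read the file with open() and slice each line:
with open("path/to/your/s19/file.s19") as s19:
    for line in s19:
        line = line.strip()  # drop the newline so the last slice is the checksum
        if not line:
            continue
        rec_type = line[0:2]
        # S0 has a 4-digit address; S3 and S7 (the other types in this file) have 8.
        addr_end = 8 if rec_type == "S0" else 12
        count = line[2:4]
        address = line[4:addr_end]
        data = line[addr_end:-2]
        checksum = line[-2:]
        # Skip empty fields (records without data) when joining.
        print(", ".join(f for f in (rec_type, count, address, data, checksum) if f))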
Hope it helps. For the field layout, see the Motorola S-record file format.
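The count and crc fields that the loop above slices out can also be checked against each other. The sketch below assumes the usual S-record convention: the count byte is the number of bytes that follow it (address, data and checksum), and the checksum is the ones' complement of the low byte of the sum of the count, address and data bytes. The function name `srec_checksum_ok` is made up for this example.

```python
def srec_checksum_ok(line):
    """Return True if the record's count and checksum fields are consistent."""
    line = line.strip()
    # Everything after the two-character type ("S3", ...) is hex-encoded bytes.
    payload = bytearray.fromhex(line[2:])
    count, body, checksum = payload[0], payload[1:-1], payload[-1]
    # The count byte covers the address, data and checksum bytes.
    if count != len(body) + 1:
        return False
    # Ones' complement of the low byte of the sum of count, address and data.
    return (~(count + sum(body)) & 0xFF) == checksum


print(srec_checksum_ok("S0030000FC"))      # True
print(srec_checksum_ok("S70500008EB4B8"))  # True
print(srec_checksum_ok("S70500008EB4B9"))  # False (checksum altered)
```

The S3 lines with counts FD and ED in the question appear to have been shortened for the post, so they would not pass the count check.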
Answer 2:
You can do it by passing a callback function as the replacement to re.sub:
#!/usr/bin/python
import re
data = r'''S0030000FC
S30D0003C0000F0000000000000020
S3FD00000000782EFF1FB58E00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S3ED000000F83D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D2B00003D
S31500000400FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF7D
S3FD0000041010B5DFF828000468012147F22C10C4F20300016047F22010C4F2030000
S70500008EB4B8'''
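# Groups: type, count, address (4 or 8 characters), data. The lookahead leaves
# the last two characters of each line (the checksum) outside the match.
pattern = re.compile(r'^(..)(..)((?:.{4}){1,2})(.*)(?=..)', re.M)
def repl(m):
    # Append ", " after every non-empty group; an empty data group is skipped.
    repstr = ''
    for g in m.groups():
        if g:
            repstr += g + ', '
    return repstr
print(re.sub(pattern, repl, data))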
However, as Mark Setchell points out, there is probably a neater way to do this with slicing.
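A slicing version might look like the sketch below. It assumes that the address width depends on the record type as in the common S-record layout (4 hex digits for S0, S1, S5 and S9; 6 for S2, S6 and S8; 8 for S3 and S7). `ADDRESS_DIGITS` and `split_srec_line` are names invented here, not part of any library.

```python
# Hex digits used by the address field of each record type (assumed layout).
ADDRESS_DIGITS = {"0": 4, "1": 4, "2": 6, "3": 8, "5": 4, "6": 6, "7": 8, "8": 6, "9": 4}


def split_srec_line(line):
    """Split one S-record line into type, count, address, data and checksum."""
    line = line.strip()
    addr_end = 4 + ADDRESS_DIGITS[line[1]]
    fields = [line[:2], line[2:4], line[4:addr_end], line[addr_end:-2], line[-2:]]
    # Records without data would otherwise leave an empty field in the output.
    return ", ".join(field for field in fields if field)


sample = [
    "S0030000FC",
    "S30D0003C0000F0000000000000020",
    "S70500008EB4B8",
]
for record in sample:
    print(split_srec_line(record))
```

Unlike the regex, this does not have to guess between a 4-character and an 8-character address, because the width is looked up from the record type.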
Answer 3:
I know you are thinking of Python and regexes, but this is the kind of job awk was made for, and the following may help you work out how to do it with slicing:
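awk '{r=length($0);print substr($0,1,2),substr($0,3,2),substr($0,5,8),substr($0,13,r-14),substr($0,r-1)}' OFS=', ' k60.s19
That says "get the length of the line in variable r, then print the first two characters, the next two characters, the next 8 characters and so on... and use a comma followed by a space as the field separator". Note that the fixed offsets assume an 8-character address, as in the S3 and S7 records; the S0 header on line 1 has a 4-character address and would need different offsets.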
EDITED
Here are a few more hints to get you started...
If you want to avoid printing line 1, you can do
awk 'FNR==1{next} ...rest of awk script above ... '
If you want to only process lines longer than 40 characters, you can do
awk 'length($0)>40 {print}' yourfile
If you only want to process lines where the second field (characters 3 and 4) is "xx", you can do
awk 'substr($0,3,2)=="xx" {print}' yourfile
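Since the question asks for Python, the three awk filters above could be approximated as in the sketch below. The function `select_records` and its parameters `skip_first`, `longer_than` and `count_field` are hypothetical names chosen for this example, and the "second field" is taken to mean characters 3 and 4 of the line.

```python
def select_records(lines, skip_first=False, longer_than=0, count_field=None):
    """Yield stripped lines that pass the requested filters."""
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if skip_first and number == 1:
            continue  # skip line 1, like FNR==1{next}
        if len(line) <= longer_than:
            continue  # keep only lines longer than the limit, like length($0)>40
        if count_field is not None and line[2:4] != count_field:
            continue  # keep only lines whose second field matches
        yield line


sample = [
    "S0030000FC",
    "S30D0003C0000F0000000000000020",
    "S31500000400FFFFFFFFFFFFFFFFFFFFFFFF7EF9FFFF7D",
    "S70500008EB4B8",
]
print(list(select_records(sample, skip_first=True)))
print(list(select_records(sample, longer_than=40)))
print(list(select_records(sample, count_field="05")))
```

Because it is a generator, the same function can be fed an open file object instead of a list.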
Source: https://stackoverflow.com/questions/24110121/how-to-manipulate-srec-file