Python regex search for hexadecimal bytes

Submitted by 陌路散爱 on 2019-12-08 02:20:41

Question


I'm trying to search a binary file for a series of hexadecimal values, but I've run into a few issues that I can't quite solve. (1) I'm not sure how to search the entire file and return all the matches. Currently I have f.seek going only as far as I think the value might be, which is no good. (2) I'd like to return the offset, in either decimal or hex, of each match, but I get 0 each time, so I'm not sure what I did wrong.

example.bin

AA BB CC DD EE FF AB AC AD AE AF BA BB BC BD BE
BF CA CB CC CD CE CF DA DB DC DD DE DF EA EB EC

code:

# coding: utf-8
import struct
import re

with open("example.bin", "rb") as f:
    f.seek(30)
    num, = struct.unpack(">H", f.read(2))
hexaPattern = re.compile(r'(0xebec)?')
m = re.search(hexaPattern, hex(num))
if m:
   print "found a match:", m.group(1)
   print " match offset:", m.start()

Maybe there's a better way to do all this?


Answer 1:


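  1. I'm not sure how to search the entire file and return all the matches.
  2. I'd like to return the offset in either decimal or hex

Read the whole file into memory and iterate over the matches with re.finditer(); the start() of each match object is the byte offset of that match:

import re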

f = open('data.txt', 'wb')
f.write('\xAA\xBB\xEB\xEC')
f.write('\xAA\xBB\xEB\xEC')
f.write('\xAA\xBB\xEB\xEC')
f.write('\xAA\xBB\xEB\xEC')
f.write('\xAA\xBB\xEB\xEC')
f.write('\xAA\xBB\xEB\xEC')
f.write('\xAA\xBB\xEB\xEC')
f.close()

f = open('data.txt', 'rb')
data = f.read()
f.close()

pattern = "\xEB\xEC"
regex = re.compile(pattern)

for match_obj in regex.finditer(data):
    offset = match_obj.start()
    print "decimal: {}".format(offset)
    print "hex(): " + hex(offset)
    print 'formatted hex: {:02X} \n'.format(offset)

--output:--
decimal: 2
hex(): 0x2
formatted hex: 02 

decimal: 6
hex(): 0x6
formatted hex: 06 

decimal: 10
hex(): 0xa
formatted hex: 0A 

decimal: 14
hex(): 0xe
formatted hex: 0E 

decimal: 18
hex(): 0x12
formatted hex: 12 

decimal: 22
hex(): 0x16
formatted hex: 16 

decimal: 26
hex(): 0x1a
formatted hex: 1A 

The positions in the file use 0-based indexing, like a list.
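The code above is written for Python 2. A rough Python 3 equivalent is sketched below, assuming the data is handled as bytes throughout. The helper name find_offsets is hypothetical, and the call to re.escape() is an addition not present in the answer: it guards against pattern bytes such as 0x2E ('.') or 0x2A ('*') being read as regex metacharacters.

```python
import re


def find_offsets(data: bytes, needle: bytes) -> list:
    """Return the start offset of every non-overlapping occurrence of needle.

    Hypothetical helper for illustration; not part of the original answer.
    """
    # Escape the needle so that every byte is matched literally.
    pattern = re.compile(re.escape(needle))
    return [m.start() for m in pattern.finditer(data)]


# Same layout as the answer's test file: AA BB EB EC repeated seven times.
data = bytes.fromhex("AABBEBEC") * 7
needle = bytes.fromhex("EBEC")

offsets = find_offsets(data, needle)
for offset in offsets:
    print("decimal: {}  hex: {:#04x}".format(offset, offset))
```

For a real file, data would come from open(path, "rb").read() instead of the in-memory sample used here.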

re.finditer(pattern, string, flags=0)
Return an iterator yielding MatchObject instances over all non-overlapping matches for the RE pattern in string. The string is scanned left-to-right, and matches are returned in the order found.

Match objects support the following methods and attributes:
start([group])
end([group])
Return the indices of the start and end of the substring matched by group; group defaults to zero (meaning the whole matched substring).

https://docs.python.org/2/library/re.html
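Because start() is an index into whatever string was searched, it also suggests why the question's code always reported 0: that code searched the short text returned by hex(), not the file contents, and the trailing ? made the whole group optional, so the pattern can match (possibly as an empty match) at position 0 of any input. The following Python 3 sketch illustrates this reading; the sample values are assumptions chosen for the demonstration.

```python
import re

# The question's pattern, applied to the text form of a number.
text_pattern = re.compile(r'(0xebec)?')

hit = text_pattern.search(hex(0xEBEC))    # searches the string '0xebec'
miss = text_pattern.search(hex(0xAABB))   # searches the string '0xaabb'

# Both report position 0: an index into the short string, not a file offset.
print(hit.start(), hit.group(1))
print(miss.start(), miss.group(1))        # empty match, so group(1) is None

# Searching the raw bytes instead yields an offset into the data itself.
data = bytes.fromhex("AABBCCDDEBEC")
byte_match = re.search(b'\xEB\xEC', data)
print(byte_match.start(), byte_match.end(), byte_match.span())
```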




Answer 2:


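Try:

import re

with open("example.bin", "rb") as f:
    # Search the raw bytes of the whole file for the two-byte sequence EB EC.
    f1 = re.search(b'\xEB\xEC', f.read())

# re.search() returns None when nothing matches, so check before using it.
if f1:
    print "found a match:", f1.group()
    print " match offset:", f1.start()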
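Note that re.search() reports only the first occurrence. If every occurrence is needed and the pattern is a fixed byte sequence, a regex is not strictly required. The Python 3 sketch below swaps in bytes.find() in a loop, which is a different technique from the one in this answer; the generator name find_all is hypothetical, and the sample bytes are copied from the question's example.bin listing.

```python
def find_all(data: bytes, needle: bytes):
    """Yield every offset at which needle occurs, including overlapping ones.

    Hypothetical helper for illustration; uses bytes.find() instead of re.
    """
    pos = data.find(needle)
    while pos != -1:
        yield pos
        # Resume one byte later so that overlapping occurrences are found too.
        pos = data.find(needle, pos + 1)


# The 32 bytes shown for example.bin in the question.
example = bytes.fromhex(
    "AA BB CC DD EE FF AB AC AD AE AF BA BB BC BD BE"
    "BF CA CB CC CD CE CF DA DB DC DD DE DF EA EB EC"
)

hits = list(find_all(example, b"\xEB\xEC"))
print(hits)
```

Unlike re.finditer(), which skips overlapping matches, this loop advances by a single byte after each hit, so a needle such as b"aa" is found three times in b"aaaa".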


Source: https://stackoverflow.com/questions/27697218/python-regex-search-for-hexadecimal-bytes
