Question
I need to XOR two bytes objects. I use this code:

def bxor(b1, b2): # use xor for bytes
    result = b""
    for b1, b2 in zip(b1, b2):
        result += bytes([b1 ^ b2])
    return result

It works fine when the bytes objects are small, but if I XOR big objects (a few MB) it takes a very long time (a few hours). How can I make it faster?
Answer 1:
When XORing bytes objects with one million elements each, this loop creates roughly one million temporary bytes objects and copies each byte, on average, roughly 500 thousand times from one temporary bytes object to the next. Note that the exact same problem exists for strings (in many other languages, too). The string solution is to create a list of string parts and use ''.join at the end to concatenate them efficiently. You can do the same thing with bytes:
def bxor(b1, b2): # use xor for bytes
    parts = []
    for b1, b2 in zip(b1, b2):
        parts.append(bytes([b1 ^ b2]))
    return b''.join(parts)
Alternatively, you can use a bytearray, which is mutable and can therefore avoid the problem. It also lets you skip allocating a new bytes object on every iteration; you can just append the byte (an int).
def bxor(b1, b2): # use xor for bytes
    result = bytearray()
    for b1, b2 in zip(b1, b2):
        result.append(b1 ^ b2)
    return result
You can alternatively return bytes(result) if you want/need a bytes object.
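For example, both variants above give the byte-by-byte XOR of their inputs (a minimal sanity check added for illustration; note that zip stops at the shorter input, so unequal-length arguments are silently truncated):

a = bytes([0x0f, 0xf0, 0xaa])
b = bytes([0xff, 0x0f, 0xaa])
# per-byte XOR: 0x0f ^ 0xff = 0xf0, 0xf0 ^ 0x0f = 0xff, 0xaa ^ 0xaa = 0x00
assert bxor(a, b) == bytes([0xf0, 0xff, 0x00])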
Answer 2:
Using a bytearray is a lot faster already:
def bxor(b1, b2):
    result = bytearray(b1)
    for i, b in enumerate(b2):
        result[i] ^= b
    return bytes(result)
A quick timeit comparison:
>>> import timeit
>>> b1, b2 = b'abcdefg' * 10, b'aaaaaaa' * 10
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor as it', number=10000)
0.9230150280000089
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_ba as it', number=10000)
0.16270576599890774
This avoids creating new bytes objects for all the concatenations.
The b''.join() method proposed by delnan is not much better than the original version:
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_join as it', number=10000)
0.9936718749995634
And a re-run with bytestrings 100 times larger:
>>> b1, b2 = b'abcdefg' * 1000, b'aaaaaaa' * 1000
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor as it', number=1000)
11.032563796999966
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_join as it', number=1000)
9.242204494001271
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_ba as it', number=1000)
1.762020197998936
to show that bytes.join() is faster than repeated concatenation.
A final run with 7 million bytes, repeated 10 times, using just the bytearray version (I ran out of patience with the other versions):
>>> b1, b2 = b'abcdefg' * 1000000, b'aaaaaaa' * 1000000
>>> timeit.timeit('it(b1, b2)', 'from __main__ import b1, b2, bxor_ba as it', number=10)
16.18445999799951
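To re-run these numbers, the setup strings import bxor, bxor_join, and bxor_ba from __main__, so all three variants must be defined in the benchmarking script. A minimal self-contained sketch, with function bodies adapted from the answers above (passing a callable to timeit.timeit is used here for convenience instead of the import-based setup strings):

import timeit

def bxor(b1, b2):                  # repeated concatenation (the original)
    result = b""
    for x, y in zip(b1, b2):
        result += bytes([x ^ y])
    return result

def bxor_join(b1, b2):             # collect one-byte parts, join once
    return b''.join(bytes([x ^ y]) for x, y in zip(b1, b2))

def bxor_ba(b1, b2):               # XOR in place into a mutable bytearray
    result = bytearray(b1)
    for i, b in enumerate(b2):
        result[i] ^= b
    return bytes(result)

b1, b2 = b'abcdefg' * 10, b'aaaaaaa' * 10
for fn in (bxor, bxor_join, bxor_ba):
    print(fn.__name__, timeit.timeit(lambda: fn(b1, b2), number=10000))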
Answer 3:
Adding this in another answer, 'cause it is one:
If you want something faster than the "manual" methods given, there's always Numpy:
import numpy

def bxor_numpy(b1, b2):
    # frombuffer/tobytes are the modern spellings of the deprecated
    # fromstring/tostring used in the original answer
    n_b1 = numpy.frombuffer(b1, dtype='uint8')
    n_b2 = numpy.frombuffer(b2, dtype='uint8')
    return (n_b1 ^ n_b2).tobytes()
and it's fast:
first_random = urandom(100000)
second_random = urandom(100000)
min(Timer(partial(bxor_inplace, first_random, second_random)).repeat(10, 100))
#>>> 1.5381054869794752
min(Timer(partial(bxor_append, first_random, second_random)).repeat(10, 100))
#>>> 1.5624085619929247
min(Timer(partial(bxor_numpy, first_random, second_random)).repeat(10, 100))
#>>> 0.009930026979418471
So it's 150x faster than the best alternatives posted here.
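As a quick correctness check (an illustrative addition, not from the original answer): XOR is its own inverse, so masking twice with the same key must return the original input.

from os import urandom

data, key = urandom(16), urandom(16)
masked = bxor_numpy(data, key)
assert bxor_numpy(masked, key) == data  # XOR twice with the same key round-trips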
Answer 4:
Martijn Pieters' timings are a bit different to mine:
def bxor_add(b1, b2): # use xor for bytes
    result = b""
    for b1, b2 in zip(b1, b2):
        result += bytes([b1 ^ b2])
    return result

def bxor_inplace(b1, b2):
    result = bytearray(b1)
    for i, b in enumerate(b2):
        result[i] ^= b
    return bytes(result)

def bxor_join(b1, b2): # use xor for bytes
    parts = []
    for b1, b2 in zip(b1, b2):
        parts.append(bytes([b1 ^ b2]))
    return b''.join(parts)

def bxor_append(b1, b2): # use xor for bytes
    result = bytearray()
    for b1, b2 in zip(b1, b2):
        result.append(b1 ^ b2)
    return bytes(result)
from os import urandom
from timeit import Timer
from functools import partial
first_random = urandom(200000)
second_random = urandom(200000)
Timer(partial(bxor_add, first_random, second_random)).timeit(1)
#>>> 1.3261873809969984
Timer(partial(bxor_inplace, first_random, second_random)).timeit(1)
#>>> 0.03055390200461261
Timer(partial(bxor_join, first_random, second_random)).timeit(1)
#>>> 0.15852201101370156
Timer(partial(bxor_append, first_random, second_random)).timeit(1)
#>>> 0.030534288001945242
first_random = urandom(10000000)
second_random = urandom(10000000)
Timer(partial(bxor_inplace, first_random, second_random)).timeit(1)
#>>> 1.5432947289955337
Timer(partial(bxor_join, first_random, second_random)).timeit(1)
#>>> 7.90503858300508
Timer(partial(bxor_append, first_random, second_random)).timeit(1)
#>>> 1.5145326450001448
I'd go with the append version for clarity and speed. For clarification, I don't think the append method is meaningfully faster than the inplace version; I just think it's a tiny bit more straightforward.
Nevertheless, because it was requested:
first_random = urandom(100000)
second_random = urandom(100000)
min(Timer(partial(bxor_inplace, first_random, second_random)).repeat(10, 100))
#>>> 1.5381054869794752
min(Timer(partial(bxor_append, first_random, second_random)).repeat(10, 100))
#>>> 1.5196998479950707
Source: https://stackoverflow.com/questions/23312571/fast-xoring-bytes-in-python-3