Question
What is the best way in Python to determine what values in two ranges overlap?
For example:
x = range(1,10)
y = range(8,20)
(The answer I am looking for would be the integers 8 and 9.)
Given a range, x, what is the best way to iterate through another range, y and output all values that are shared by both ranges? Thanks in advance for the help.
EDIT:
As a follow-up, I realized that I also need to know whether x does or does not overlap y. I am looking for a way to iterate through a list of ranges and do a number of additional things with the ranges that overlap. Is there a simple True/False statement to accomplish this?
Answer 1:
Try with set intersection:
>>> x = range(1,10)
>>> y = range(8,20)
>>> xs = set(x)
>>> xs.intersection(y)
set([8, 9])
Note that intersection accepts any iterable as an argument (y is not required to be converted to a set for the operation).
There is an operator equivalent to the intersection method, &, but in this case it requires both arguments to be sets.
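As a small illustration of the difference between the method and the operator, here is a Python 3 sketch (in Python 3 a set prints as {8, 9} rather than set([8, 9])). The variable names are only for illustration.

```python
x = range(1, 10)
y = range(8, 20)

# The method accepts any iterable, so y can stay a range.
via_method = set(x).intersection(y)

# The operator needs a set on both sides.
via_operator = set(x) & set(y)

# Mixing a set and a range with & raises TypeError.
try:
    set(x) & y
    operator_accepts_range = True
except TypeError:
    operator_accepts_range = False

print(sorted(via_method), sorted(via_operator), operator_accepts_range)
# [8, 9] [8, 9] False
```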
Answer 2:
If the step is always +1 (which is the default for range) the following should be more efficient than converting each list to a set or iterating over either list:
range(max(x[0], y[0]), min(x[-1], y[-1])+1)
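A minimal sketch of the same arithmetic wrapped in a function, assuming Python 3 and a step of 1. The guard for empty inputs is an addition (x[0] raises IndexError on an empty range), and the name overlap_by_endpoints is hypothetical.

```python
def overlap_by_endpoints(x, y):
    """Return the overlap of two step-1 ranges as a range (possibly empty)."""
    if len(x) == 0 or len(y) == 0:
        return range(0)  # nothing can overlap an empty range
    # x[-1] and y[-1] are the last values actually contained, hence the +1.
    return range(max(x[0], y[0]), min(x[-1], y[-1]) + 1)

print(list(overlap_by_endpoints(range(1, 10), range(8, 20))))   # [8, 9]
print(list(overlap_by_endpoints(range(1, 10), range(10, 14))))  # []
```

When the ranges do not overlap, the computed start is greater than or equal to the computed stop, so the result is simply an empty range.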
Answer 3:
You can use sets for that, but be aware that set(list) removes all duplicate entries from the list:
>>> x = range(1,10)
>>> y = range(8,20)
>>> list(set(x) & set(y))
[8, 9]
Answer 4:
One option is to just use a list comprehension:
x = range(1,10)
y = range(8,20)
z = [i for i in x if i in y]
print z
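In Python 3 the same comprehension stays cheap, because membership tests on a range of integers are computed arithmetically rather than by scanning. A short sketch, assuming Python 3; the second example (an addition, not part of the answer) shows that it also works when the steps differ:

```python
x = range(1, 10)
y = range(8, 20)

# `i in y` does not iterate over y for integer i, so this is O(len(x)).
z = [i for i in x if i in y]
print(z)  # [8, 9]

# Unlike endpoint arithmetic, this also handles steps other than 1.
evens_in_x = [i for i in x if i in range(0, 20, 2)]
print(evens_in_x)  # [2, 4, 6, 8]
```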
Answer 5:
For "if x does or does not overlap y" :
for a,b,c,d in ((1,10,10,14),
                (1,10,9,14),
                (1,10,4,14),
                (1,10,4,10),
                (1,10,4,9),
                (1,10,4,7),
                (1,10,1,7),
                (1,10,-3,7),
                (1,10,-3,2),
                (1,10,-3,1),
                (1,10,-11,-5)):
    x = range(a,b)
    y = range(c,d)
    print 'x==',x
    print 'y==',y
    b = not ((x[-1]<y[0]) or (y[-1]<x[0]))
    print ' x %s y' % ("does not overlap"," OVERLAPS ")[b]
    print
result
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [10, 11, 12, 13]
x does not overlap y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [9, 10, 11, 12, 13]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8, 9]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6, 7, 8]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [4, 5, 6]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [1, 2, 3, 4, 5, 6]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1, 2, 3, 4, 5, 6]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0, 1]
x OVERLAPS y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-3, -2, -1, 0]
x does not overlap y
x== [1, 2, 3, 4, 5, 6, 7, 8, 9]
y== [-11, -10, -9, -8, -7, -6]
x does not overlap y
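For the follow-up question about going through a list of ranges, the endpoint test can be wrapped in a helper and used as a filter. A minimal Python 3 sketch, assuming a step of 1; ranges_overlap and the sample list are hypothetical names chosen for illustration, and the empty-range guard is an addition:

```python
def ranges_overlap(x, y):
    """True if two step-1 ranges share at least one value."""
    if len(x) == 0 or len(y) == 0:
        return False
    return not (x[-1] < y[0] or y[-1] < x[0])

x = range(1, 10)
candidates = [range(10, 14), range(9, 14), range(-3, 2), range(-3, 1)]

# Keep only the ranges that overlap x, then do further work on them.
overlapping = [r for r in candidates if ranges_overlap(x, r)]
print(overlapping)  # [range(9, 14), range(-3, 2)]
```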
Edit 1
Speed comparison:
from time import clock
x = range(-12,15)
y = range(-5,3)
te = clock()
for i in xrange(100000):
    w = set(x).intersection(y)
print ' set(x).intersection(y)',clock()-te
te = clock()
for i in xrange(100000):
    w = range(max(x[0], y[0]), min(x[-1], y[-1])+1)
print 'range(max(x[0], y[0]), min(x[-1], y[-1])+1)',clock()-te
result
set(x).intersection(y) 0.951059981087
range(max(x[0], y[0]), min(x[-1], y[-1])+1) 0.377761978129
The ratio of these execution times is about 2.5.
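time.clock was removed in Python 3.8, so the comparison above cannot run unchanged there. A rough equivalent using the standard timeit module might look like this (Python 3 assumed; absolute timings differ by machine, and no particular ratio is implied):

```python
import timeit

x = range(-12, 15)
y = range(-5, 3)

# Time 100000 repetitions of each approach, as in the comparison above.
t_set = timeit.timeit(lambda: set(x).intersection(y), number=100000)
t_range = timeit.timeit(
    lambda: range(max(x[0], y[0]), min(x[-1], y[-1]) + 1), number=100000
)

print('set(x).intersection(y):', t_set)
print('range(max(...), min(...) + 1):', t_range)
```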
Answer 6:
If you are looking for the overlap between two real-valued bounded intervals, then this is quite nice:
def overlap(start1, end1, start2, end2):
    """how much does the range (start1, end1) overlap with (start2, end2)"""
    return max(max((end2-start1), 0) - max((end2-end1), 0) - max((start2-start1), 0), 0)
I couldn't find this online anywhere, so I came up with it and am posting it here.
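For comparison, a shorter min/max form gives the same result on the cases tried below, assuming start <= end for both intervals. This is a sketch, not a replacement endorsed by the answer; overlap_minmax is a hypothetical name.

```python
def overlap(start1, end1, start2, end2):
    """Length of the overlap, as written in the answer above."""
    return max(max((end2 - start1), 0) - max((end2 - end1), 0) - max((start2 - start1), 0), 0)

def overlap_minmax(start1, end1, start2, end2):
    """Hypothetical alternative: overlap runs from the larger start to the smaller end."""
    return max(0, min(end1, end2) - max(start1, start2))

# Partial overlap (both orders), disjoint, nested, and a float case.
cases = [(1, 10, 8, 20), (8, 20, 1, 10), (1, 3, 5, 7), (1, 10, 3, 5), (0.5, 2.5, 1.0, 4.0)]
for case in cases:
    print(case, overlap(*case), overlap_minmax(*case))
```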
Answer 7:
Most of the answers above seem overly complex. This one-liner works in Python 3; it takes ranges as input and returns a range. It also handles illegal ranges. To get the values, just iterate over the result if it is not None.
# Return the overlap of two range objects as a range, or None if there is no overlap.
# Does not handle step != 1.
def range_intersect(r1, r2):
    return range(max(r1.start,r2.start), min(r1.stop,r2.stop)) or None
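A short usage sketch for the one-liner, assuming Python 3. It relies on an empty range being falsy, which is why `or None` turns "no overlap" into None.

```python
def range_intersect(r1, r2):
    return range(max(r1.start, r2.start), min(r1.stop, r2.stop)) or None

print(range_intersect(range(1, 10), range(8, 20)))   # range(8, 10)
print(range_intersect(range(1, 10), range(10, 14)))  # None
print(range_intersect(range(10, 1), range(0, 20)))   # None (first range is empty)

# Iterate over the result only if it is not None.
result = range_intersect(range(1, 10), range(8, 20))
values = list(result) if result is not None else []
print(values)  # [8, 9]
```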
Answer 8:
This answer covers the simple step=1 case (99% of the time). When you only need to know whether two ranges overlap, it can be thousands of times faster than comparing long ranges with sets, as the benchmark below shows:
x = range(1,10)
y = range(8,20)
def range_overlapping(x, y):
    if x.start == x.stop or y.start == y.stop:
        return False
    return ((x.start < y.stop and x.stop > y.start) or
            (x.stop > y.start and y.stop > x.start))
>>> range_overlapping(x, y)
True
To find the overlapping values:
def overlap(x, y):
    if not range_overlapping(x, y):
        return set()
    # stop is exclusive, so no +1 here (adding 1 would include a value outside one range)
    return set(range(max(x.start, y.start), min(x.stop, y.stop)))
Benchmark:
x = range(1,10)
y = range(8,20)
In [151]: %timeit set(x).intersection(y)
2.74 µs ± 11.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [152]: %timeit range_overlapping(x, y)
1.4 µs ± 2.91 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Conclusion: even for small ranges, it is twice as fast.
x = range(1,10000)
y = range(50000, 500000)
In [155]: %timeit set(x).intersection(y)
43.1 ms ± 158 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
In [156]: %timeit range_overlapping(x, y)
1.75 µs ± 88.3 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
Conclusion: use the range_overlapping function in this case; by these timings (43.1 ms vs. 1.75 µs) it is roughly 25,000x faster (my personal record in speedup).
Answer 9:
If you want to find the overlap of ranges with arbitrary steps you can use my package https://github.com/avnr/rangeplus which provides a Range() class compatible with Python range(), plus some goodies including intersections:
>>> from rangeplus import Range
>>> Range(1, 100, 3) & Range(2, 100, 4)
Range(10, 100, 12)
>>> Range(200, -200, -7) & range(5, 80, 2) # can intersect with Python range() too
Range(67, 4, -14)
Range() can also be unbounded (when stop is None, the Range goes on to +/-infinity):
>>> Range(1, None, 3) & Range(3, None, 4)
Range(7, None, 12)
>>> Range(253, None, -3) & Range(208, 310, 5)
Range(253, 207, -15)
The intersection is computed, not iterated, which makes the efficiency of the implementation independent of the length of the Range().
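For bounded ranges, the first two results above can be cross-checked by brute force using only the built-in range. This sketch iterates, so unlike the package it does depend on the length of the ranges; it assumes Python 3 and does not use rangeplus at all.

```python
# Brute-force intersection of stepped ranges using only built-in range.
a = range(1, 100, 3)
b = range(2, 100, 4)
common = [i for i in a if i in b]
print(common == list(range(10, 100, 12)))  # True

# A descending range intersected with an ascending one.
c = range(200, -200, -7)
d = range(5, 80, 2)
common_desc = [i for i in c if i in d]
print(common_desc == list(range(67, 4, -14)))  # True
```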
Answer 10:
Assuming you are working exclusively with ranges with a step of 1, you can do it quickly with math.
def range_intersect(range_x,range_y):
    if len(range_x) == 0 or len(range_y) == 0:
        return []
    # find the endpoints
    x = (range_x[0], range_x[-1]) # from the first element to the last, inclusive
    y = (range_y[0], range_y[-1])
    # ensure min is before max
    # this can be excluded if the ranges must always be increasing
    x = tuple(sorted(x))
    y = tuple(sorted(y))
    # the range of the intersection is guaranteed to be from the maximum of the min values to the minimum of the max values, inclusive
    z = (max(x[0],y[0]),min(x[1],y[1]))
    if z[0] <= z[1]: # <= so that a single shared value still counts as an intersection
        return range(z[0], z[1] + 1) # to make this an inclusive range
    else:
        return [] # no intersection
On a pair of ranges each with over 10^7 elements, this took under a second, independent of how many elements overlapped. I tried with 10^8 or so elements, but my computer froze for a while. I doubt you'd be working with lists that long.
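In Python 3, range objects are lazy, so the endpoint approach does not depend on materializing lists. A small sketch, assuming Python 3 and a step of 1; the names are illustrative only.

```python
big_x = range(0, 10**12)
big_y = range(5 * 10**11, 2 * 10**12)

# Indexing a range is O(1); no list of 10**12 integers is ever built.
lo = max(big_x[0], big_y[0])
hi = min(big_x[-1], big_y[-1])
common = range(lo, hi + 1) if lo <= hi else range(0)

print(common)  # range(500000000000, 1000000000000)
```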
Source: https://stackoverflow.com/questions/6821156/how-to-find-range-overlap-in-python