Question:
Given an array of integers, find the number of all ordered pairs of elements in the array whose sum lies in a given range [a,b].
Here is an O(n^2) solution:
''' counts all pairs in array such that the sum of pair lies in the range a and b '''
def countpairs(array, a, b):
    num_of_pairs = 0
    for i in range(len(array)):
        for j in range(i+1,len(array)):
            total = array[i] + array[j]
            if total >= a and total <= b:
                num_of_pairs += 1
    return num_of_pairs
I know my solution is not optimal. What is a better algorithm for doing this?
Answer 1:
- Sort the array (say in increasing order).
- For each element x in the array:
  - Consider the array slice after the element.
  - Do a binary search on this array slice for [a - x], call it y0. If no exact match is found, consider the closest match bigger than [a - x] as y0.
  - Output all elements (x, y) from y0 forwards as long as x + y <= b
The time complexity is of course output-sensitive, but this is still superior to the existing algorithm: O(nlogn) + O(k), where k is the number of pairs that satisfy the condition.
Note: If you only need to count the number of pairs, you can do it in O(nlogn). Modify the above algorithm so [b - x] (or the next smaller element) is also searched for. This way, you can count the number of 'matches' each element has in O(logn) simply from the indices of the first and last match. Then it's just a question of summing those up to get the final count. This way, the initial O(nlogn) sorting step is dominant.
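The counting variant described in the note can be sketched in Python with the standard bisect module. This is only a sketch under a few assumptions: the function name count_pairs_bisect is made up for illustration, a <= b is assumed, and each pair is counted once (i < j), so the result would be doubled if ordered pairs are wanted.

```python
from bisect import bisect_left, bisect_right

def count_pairs_bisect(values, a, b):
    """Count pairs i < j whose sum lies in [a, b] (assumes a <= b)."""
    vals = sorted(values)          # O(n log n), does not modify the input
    n = len(vals)
    total = 0
    for i, x in enumerate(vals):
        # Search only the slice after position i so each pair is seen once.
        first = bisect_left(vals, a - x, i + 1, n)   # first partner with x + y >= a
        last = bisect_right(vals, b - x, i + 1, n)   # one past last partner with x + y <= b
        total += last - first
    return total
```

Each element costs two binary searches, so the loop is O(n log n) and no pair is ever enumerated explicitly.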
Answer 2:
Sort the array first, then count the pairs using two indexes. The two-index approach is similar to the one used for the 2-sum problem and avoids running a binary search N times. The running time of the algorithm is the sort complexity + O(N); sorting is typically O(N log N), so the whole approach is O(N log N). The idea of the algorithm is, for an index i, to find a lower bound and an upper bound such that a <= arr[i]+arr[low] <= arr[i]+arr[high] <= b. When i increases, we decrease low and high so that the condition keeps holding. To avoid counting the same pair twice, we keep low > i, and we also keep low <= high. The complexity of the following counting approach is O(N) because each step in the while loop is a ++i, a --low, or a --high, and there are at most N of each such operation.
#include <algorithm>

// Count pairs whose sum is in [a, b].
// arr is a sorted array with size integers.
int countPair(int arr[], int size, int a, int b) {
    int cnt = 0;
    int i = 0, low = size - 1, high = size - 1;
    while (i < high) {
        // Find the lower bound such that arr[i] + arr[low] < a,
        // meanwhile arr[i] + arr[low+1] >= a.
        low = std::max(i, low);
        while (low > i && arr[i] + arr[low] >= a) --low;
        // Find an upper bound such that arr[i] + arr[high] <= b,
        // meanwhile arr[i] + arr[high+1] > b.
        while (high > low && arr[i] + arr[high] > b) --high;
        // All pairs arr[i]+arr[low+1], arr[i]+arr[low+2], ..., arr[i]+arr[high]
        // are in the range [a, b], and we count them as follows.
        cnt += (high - low);
        ++i;
    }
    return cnt;
}
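For readers following along in Python, here is a sketch of a closely related two-index formulation. It is not a line-by-line port of the function above: it counts the pairs with sum <= b, counts the pairs with sum < a, and subtracts. The names count_below and count_pairs_two_pointer are hypothetical, and each pair is counted once (i < j).

```python
def count_below(sorted_vals, limit, strict):
    """Count pairs i < j with sum <= limit (or < limit when strict is True)."""
    count = 0
    lo, hi = 0, len(sorted_vals) - 1
    while lo < hi:
        s = sorted_vals[lo] + sorted_vals[hi]
        if s < limit or (s == limit and not strict):
            # sorted_vals[lo] pairs with every index in lo+1..hi.
            count += hi - lo
            lo += 1
        else:
            hi -= 1
    return count

def count_pairs_two_pointer(values, a, b):
    """Count pairs i < j with a <= sum <= b using two linear passes."""
    vals = sorted(values)
    return count_below(vals, b, strict=False) - count_below(vals, a, strict=True)
```

After sorting, each pass moves one of the two indexes inward on every step, so the counting part is O(N).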
Answer 3:
The problem of counting the pairs that work can be done in sort time + O(N). This is faster than the solution that Ani gives, which is sort time + O(N log N). The idea goes like this. First you sort. You then run nearly the same single pass algorithm twice. You then can use the results of the two single pass algorithms to calculate the answer.
The first time we run the single pass algorithm, we will create a new array that lists the smallest index that can partner with that index to give a sum of at least a. Example:
a = 6
array = [-20, 1, 3, 4, 8, 11]
output = [6, 4, 2, 2, 1, 1]
So, the number at array index 1 is 1 (0-based indexing). The smallest number it can pair with to reach at least 6 is the 8, which is at index 4. Hence output[1] = 4. -20 can't pair with anything, so output[0] = 6 (out of bounds). Another example: output[4] = 1, because 8 (index 4) can pair with the 1 (index 1) or any number after it to sum to at least 6.
What you need to do now is convince yourself that this is O(N). It is. The code is:
i, j = 0, 5
while i - j <= 0:
    if array[i] + array[j] >= a:
        output[j] = i
        j -= 1
    else:
        output[i] = j + 1
        i += 1
Just think of two pointers starting at the edges and working inwards. It's O(N). You now do the same thing, just with the condition that the sum is <= b, after resetting the two pointers:
i, j = 0, 5
while i - j <= 0:
    if array[i] + array[j] <= b:
        output2[i] = j
        i += 1
    else:
        output2[j] = i - 1
        j -= 1
In our example, this code gives you (array and b for reference):
b = 9
array = [-20, 1, 3, 4, 8, 11]
output2 = [5, 4, 3, 3, 1, 0]
But now, output and output2 contain all the information we need, because they contain the range of valid indices for pairings. output is the smallest index it can be paired with, output2 is the largest index it can be paired with. The difference + 1 is the number of pairings for that location. So for the first location (corresponding to -20), there are 5 - 6 + 1 = 0 pairings. For 1, there is 4 - 4 + 1 = 1 pairing, with the number at index 4, which is 8.
Another subtlety: this algorithm counts self pairings, so if you don't want them, you have to subtract. E.g. 3 seems to have 3 - 2 + 1 = 2 pairings, one at index 2 and one at index 3. Of course, 3 itself is at index 2, so one of those is the self pairing and the other is the pairing with 4. You just need to subtract one whenever the range of indices given by output and output2 contains the index you're looking at. In code, you can write:
answer = [o2 - o + 1 - (o <= i <= o2) for i, (o, o2) in enumerate(zip(output, output2))]
Which yields:
answer = [0, 1, 1, 1, 1, 0]
Which sums to 4, corresponding to (1,8), (3,4), (4,3), (8, 1)
Anyhow, as you can see, this is sort + O(N), which is optimal.
Edit: asked for full implementation. Provided. For reference, the full code:
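def count_ranged_pairs(x, a, b):
    x.sort()  # note: sorts the caller's list in place

    output = [0] * len(x)
    output2 = [0] * len(x)

    # First pass: smallest partner index giving a sum >= a.
    i, j = 0, len(x)-1
    while i - j <= 0:
        if x[i] + x[j] >= a:
            output[j] = i
            j -= 1
        else:
            output[i] = j + 1
            i += 1

    # Second pass: largest partner index giving a sum <= b.
    i, j = 0, len(x) - 1
    while i-j <= 0:
        if x[i] + x[j] <= b:
            output2[i] = j
            i += 1
        else:
            output2[j] = i-1
            j -=1

    answer = [o2 - o + 1 - (o <= i <= o2) for i, (o, o2) in enumerate(zip(output, output2))]
    # answer counts ordered pairs; integer division counts each unordered pair once.
    return sum(answer)//2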
Answer 4:
from itertools import combinations

def countpairs2(array, a, b):
    pairInRange = lambda x: sum(x) >= a and sum(x) <= b
    # In Python 3 the built-in filter is lazy and replaces Python 2's itertools.ifilter.
    filtered = filter(pairInRange, combinations(array, 2))
    return sum([2 for x in filtered])
I think the itertools library comes in quite handy here. Since the question asks for ordered pairs, this counts (1, 3) and (3, 1) as two different pairs. If you don't want that, just change the 2 in the last line to a 1. Note: The last line could be changed to return len(list(filtered)) * 2. This CAN be faster, but at the expense of using more RAM.
Answer 5:
With some constraints on the data we can solve the problem in linear time (sorry for Java, I'm not very proficient with Python):
public class Program {
    public static void main(String[] args) {
        test(new int[]{-2, -1, 0, 1, 3, -3}, -1, 2);
        test(new int[]{100,200,300}, 300, 300);
        test(new int[]{100}, 1, 1000);
        test(new int[]{-1, 0, 0, 0, 1, 1, 1000}, -1, 2);
    }

    public static int countPairs(int[] input, int a, int b) {
        int min = Integer.MAX_VALUE;
        int max = Integer.MIN_VALUE;
        for (int el : input) {
            max = Math.max(max, el);
            min = Math.min(min, el);
        }
        int d = max - min + 1; // "Diameter" of the array

        // Build naive hash-map of input: Map all elements to range [0; d - 1]
        int[] lookup = new int[d];
        for (int el : input) {
            lookup[el - min]++;
        }

        // a and b also need to be adjusted
        int a1 = a - min;
        int b1 = b - min;

        int[] counts = lookup; // Just rename
        // i-th element contains count of lookup elements in range [0; i]
        for (int i = 1; i < counts.length; ++i) {
            counts[i] += counts[i - 1];
        }

        int res = 0;
        for (int el : input) {
            int lo = a1 - el; // el2 >= lo
            int hi = b1 - el; // el2 <= hi
            lo = Math.max(lo, 0);
            hi = Math.min(hi, d - 1);
            if (lo <= hi) {
                res += counts[hi];
                if (lo > 0) {
                    res -= counts[lo - 1];
                }
            }
            // Exclude pair with same element
            if (a <= 2*el && 2*el <= b) {
                --res;
            }
        }
        // Calculated pairs are ordered, divide by 2
        return res / 2;
    }

    public static int naive(int[] ar, int a, int b) {
        int res = 0;
        for (int i = 0; i < ar.length; ++i) {
            for (int j = i + 1; j < ar.length; ++j) {
                int sum = ar[i] + ar[j];
                if (a <= sum && sum <= b) {
                    ++res;
                }
            }
        }
        return res;
    }

    private static void test(int[] input, int a, int b) {
        int naiveSol = naive(input, a, b);
        int optimizedSol = countPairs(input, a, b);
        if (naiveSol != optimizedSol) {
            System.out.println("Problem!!!");
        }
    }
}
For each element of the array we know the range in which the second element of the pair can lie. The core of this algorithm is obtaining the count of elements in the range [a; b] in O(1) time.
The resulting complexity is O(max(N, D)), where D is the difference between the max and min elements of the array. If this value is of the same order as N, the complexity is O(N).
Notes:
- No sorting involved!
- Building lookup is required to make algorithm work with negative numbers and make second array as small as possible (positively impacts both memory and time)
- The ugly condition if (a <= 2*el && 2*el <= b) is required because the algorithm always counts the pairs (a[i],a[i])
- The algorithm requires O(d) additional memory, which can be a lot.
Another linear algorithm would be radix sort + linear pair counting.
EDIT: This algorithm can be really good when D is considerably smaller than N and you are not allowed to modify the input array. An alternative for this case would be a slightly modified counting sort that allocates the counts array (additional O(D) memory) but does not write the sorted elements back to the input array. It's possible to adapt pair counting to use the counts array instead of the full sorted array.
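As a rough Python rendering of the same prefix-count idea, the sketch below builds a frequency table over [min, max], turns it into running totals, and answers each range query in O(1). The names count_pairs_prefix and at_most are invented for illustration; integer inputs and a <= b are assumed, and each pair is counted once.

```python
from itertools import accumulate

def count_pairs_prefix(values, a, b):
    """Count pairs i < j with a <= values[i] + values[j] <= b (integers only)."""
    if len(values) < 2:
        return 0
    lo_val, hi_val = min(values), max(values)
    freq = [0] * (hi_val - lo_val + 1)      # O(D) extra memory
    for v in values:
        freq[v - lo_val] += 1
    prefix = list(accumulate(freq))         # prefix[k] = number of elements <= lo_val + k

    def at_most(x):
        """Number of elements with value <= x."""
        if x < lo_val:
            return 0
        return prefix[min(x, hi_val) - lo_val]

    ordered = 0
    for v in values:
        # Partners must lie in [a - v, b - v].
        ordered += at_most(b - v) - at_most(a - v - 1)
        if a <= 2 * v <= b:
            ordered -= 1                    # drop the pairing of an element with itself
    return ordered // 2                     # ordered pairs -> each pair once
```

The input list is left untouched, which matches the "not allowed to modify the input array" case mentioned in the edit.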
Answer 6:
I have a solution(actually 2 solutions ;-)). Writing it in python:
def find_count(input_list, min, max):
    count = 0
    for i in range(len(input_list)):
        # An element paired with itself counts once.
        if input_list[i]*2 >= min and input_list[i]*2 <= max:
            count += 1
        # Each pair of distinct positions counts twice, as (i, j) and (j, i).
        for j in range(i+1, len(input_list)):
            input_sum = input_list[i] + input_list[j]
            if input_sum >= min and input_sum <= max:
                count += 2
    return count
This runs at most nCr (n choose 2) inner iterations and gives you the required count. It avoids sorting the list, although the running time is still O(n^2). If many elements cannot be part of any valid pair, and all the numbers are positive integers, we can improve on this a little by adding a condition that skips:
- Numbers that do not fall under the range even with the addition of the max value
- Numbers that are greater than the maximum number of the range.
Something like this:
# list_maximum is the maximum number of the list, i.e. the largest value in input_list, if already known
def find_count(input_list, min, max, list_maximum):
    count = 0
    for i in range(len(input_list)):
        # Skip elements that cannot be part of any valid pair.
        if input_list[i] > max or input_list[i] + list_maximum < min:
            continue
        if input_list[i]*2 >= min and input_list[i]*2 <= max:
            count += 1
        for j in range(i+1, len(input_list)):
            input_sum = input_list[i] + input_list[j]
            if input_sum >= min and input_sum <= max:
                count += 2
    return count
I will also be happy to learn any better solution than this :-) If I come across one, I will update this answer.
Answer 7:
I believe this is a simple math problem that can be solved with numpy, with no loops and no sorting on our part. I'm not exactly sure, but I believe the complexity to be O(N^2) in the worst case (I would love some confirmation on that from someone more knowledgeable about time complexities in numpy).
At any rate, here's my solution:
import numpy as np

def count_pairs(input_array, min, max):
    A = np.array(input_array)
    A_ones = np.ones((len(A),len(A)))
    A_matrix = A*A_ones
    result = np.transpose(A_matrix) + A_matrix
    result = np.triu(result,0)
    np.fill_diagonal(result,0)
    count = ((result > min) & (result < max)).sum()
    return count
Now let's walk through it - first I just create a matrix with columns representing our numbers:
A = np.array(input_array)
A_ones = np.ones((len(A),len(A)))
A_matrix = A*A_ones
Let's assume that our input array looked like [1,1,2,2,3,-1]; this should then be the value of A_matrix at this point:
[[ 1.  1.  2.  2.  3. -1.]
 [ 1.  1.  2.  2.  3. -1.]
 [ 1.  1.  2.  2.  3. -1.]
 [ 1.  1.  2.  2.  3. -1.]
 [ 1.  1.  2.  2.  3. -1.]
 [ 1.  1.  2.  2.  3. -1.]]
If I add that to the transpose of itself...
result = np.transpose(A_matrix) + A_matrix
...I should get a matrix representing all combinations of sums of pairs:
[[ 2.  2.  3.  3.  4.  0.]
 [ 2.  2.  3.  3.  4.  0.]
 [ 3.  3.  4.  4.  5.  1.]
 [ 3.  3.  4.  4.  5.  1.]
 [ 4.  4.  5.  5.  6.  2.]
 [ 0.  0.  1.  1.  2. -2.]]
Of course, this matrix is mirrored across the diagonal because the pairs (1,2) and (2,1) produce the same result. We don't want to consider these duplicate entries. We also don't want to consider the sum of an item with itself, so let's sanitize our array:
result = np.triu(result,0)
np.fill_diagonal(result,0)
Our result now looks like:
[[ 0.  2.  3.  3.  4.  0.]
 [ 0.  0.  3.  3.  4.  0.]
 [ 0.  0.  0.  4.  5.  1.]
 [ 0.  0.  0.  0.  5.  1.]
 [ 0.  0.  0.  0.  0.  2.]
 [ 0.  0.  0.  0.  0.  0.]]
All that remains is to count the items that pass our criteria.
count = ((result > min) & (result < max)).sum()
A word of caution: this method won't work if 0 is in the acceptable domain, because the zeroed-out entries below and on the diagonal would then be counted as matches. I'm sure it would be trivial to manipulate the result matrix above to convert those 0's to some other meaningless number.
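One possible way around the zero problem, offered as a sketch and not as the answer's own method, is to never build the zero-filled matrix and instead select only the positions above the diagonal with np.triu_indices. The function name count_pairs_triu is made up here, and the bounds are treated as inclusive, as in the question.

```python
import numpy as np

def count_pairs_triu(values, lo, hi):
    """Count pairs i < j with lo <= values[i] + values[j] <= hi."""
    arr = np.asarray(values)
    # Row and column indices of every entry strictly above the diagonal.
    i, j = np.triu_indices(len(arr), k=1)
    sums = arr[i] + arr[j]
    return int(np.count_nonzero((sums >= lo) & (sums <= hi)))
```

This still materialises about N*(N-1)/2 sums, so time and memory remain quadratic in N.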
Answer 8:
Rather than using the relational operators, we can simply check whether the sum of array elements i and j is in the specified range. Note that range excludes its end point, so stop + 1 is needed to keep the upper bound inclusive, and that this check is only meant for integer sums.
def get_numOfPairs(array, start, stop):
    num_of_pairs = 0
    array_length = len(array)
    for i in range(array_length):
        for j in range(i+1, array_length):
            if sum([array[i], array[j]]) in range(start, stop + 1):
                num_of_pairs += 1
    return num_of_pairs
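A membership test against range only matches sums equal to one of the integers in that range, so a sum such as 1.5 would never match. If non-integer values are possible, a chained comparison is one alternative; the sketch below uses a hypothetical name, count_pairs_chained, and treats both bounds as inclusive.

```python
def count_pairs_chained(array, start, stop):
    """Count pairs i < j with start <= array[i] + array[j] <= stop."""
    num_of_pairs = 0
    for i in range(len(array)):
        for j in range(i + 1, len(array)):
            # Chained comparison: works for ints and floats alike.
            if start <= array[i] + array[j] <= stop:
                num_of_pairs += 1
    return num_of_pairs
```

It is still an O(n^2) loop, so it only changes how the range check is written, not the running time.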