Are there any 'tricks' to speed up sampling of a very large knapsack combination type problem?

情话喂你 2020-12-29 10:21

UPDATE: I have realized that the problem below cannot be answered in its current form because of the large amount of data involved (15k+ items). I just found out, the

9 answers
  •  执念已碎
    2020-12-29 10:33

    Your code does not match your problem statement and it is therefore unclear how to proceed.

    You say that the data list contains negative values and duplicates, and your example shows both. In fact, the values are limited to non-zero integers in the range [-200, 200], which allows at most 400 distinct values, while the data list has at least 2,000 items and typically 10,000 or more, so it must contain duplicates.

    Let's review your "basic logic":

    for (int c = 100; c >= 0; c--) {
        if (c * x_k == current.sum) { // if the result is correct, save it
            solutions.add(new Context(0, 0, newcoeff));
            continue;
        } else if (current.k > 0) { // recurse with the next data element
            contexts.add(new Context(current.k - 1, current.sum - c * x_k, newcoeff));
        }
    }
    

    Elsewhere you state that the data must be sorted in numerical order and that you start from the tail of the list, k = n - 1 (because of zero indexing), so you process the largest values first. The "then" clause of the if (the branch that saves a solution) also terminates the recursion. While this may be fine for the problem you are actually solving, it is not the problem you are describing, because it ignores every combination of lesser data values that sums to zero.

    On the other hand, all the combinations of greater values that sum to zero would be included.

    Let's look, for example, at the last item on your example list, 156, with target sum 5000.

    156 * 100 = 15600, so it cannot match the target sum until you get into the negative numbers. Of course

    (100 * -100) + (100 * -6) + (100 * 156) = 5000
    

    and this combination works. (Your sample data set does not include a -100, but it does have two -40s and a -20, so if you want to be true to the data set combine them instead. I'm using -100 to keep the example simple and because you say the data set could include -100.)

    But of course

    (100 * -100) + (100 * -6) + (c * -1) + (c * 1) + (100 * 156) = 5000 
    

    for any c, so the output will contain 100 combinations like this (1 <= c <= 100). But your data set also contains 50. When you get to 100 * 50 = 5000, you terminate the recursion, so you will never get

    (c * -1) + (c * 1) + (100 * 50) = 5000 
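
To make the lost combinations concrete, here is a minimal Java sketch. It is not your actual code: it assumes a tiny hypothetical data set {-1, 1, 50}, coefficients from 0 to 100 and a target of 5000, and the names EarlyStopDemo and count are made up for illustration. It counts solutions once with the stop-on-match rule from the loop quoted earlier and once exhaustively.

```java
class EarlyStopDemo {
    /**
     * Counts coefficient vectors (0 <= c_i <= maxCoeff) for which the
     * weighted sum of data[0..k] equals sum, walking from the tail (index k)
     * down to index 0.
     *
     * stopOnMatch = true  mimics the quoted loop: as soon as c * data[k]
     *                     hits the remaining sum, save and do not recurse.
     * stopOnMatch = false always recurses down to index 0 (exhaustive).
     */
    static long count(int[] data, int k, long sum, int maxCoeff, boolean stopOnMatch) {
        long found = 0;
        for (int c = maxCoeff; c >= 0; c--) {
            long rest = sum - (long) c * data[k]; // what the lesser items must supply
            if (stopOnMatch) {
                if (rest == 0) {
                    found++;   // saved, but the lesser items are never visited
                    continue;
                }
                if (k > 0) {
                    found += count(data, k - 1, rest, maxCoeff, true);
                }
            } else if (k == 0) {
                if (rest == 0) {
                    found++;   // every item has a coefficient and the sum matches
                }
            } else {
                found += count(data, k - 1, rest, maxCoeff, false);
            }
        }
        return found;
    }

    public static void main(String[] args) {
        int[] data = {-1, 1, 50}; // hypothetical sample, sorted ascending
        long early = count(data, data.length - 1, 5000, 100, true);
        long full = count(data, data.length - 1, 5000, 100, false);
        System.out.println("stop on match: " + early);
        System.out.println("exhaustive:    " + full);
        System.out.println("missed:        " + (full - early));
    }
}
```

Under these assumptions the exhaustive count exceeds the stop-on-match count by exactly the 100 combinations (c * -1) + (c * 1) + (100 * 50) with 1 <= c <= 100.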
    

    So either your code or your problem statement is buggy. Probably both: even without considering the coefficients, 10,000 items taken 60 at a time yields on the order of 10^158 combinations. Aside from this premature termination of the recursion, I see nothing that would spare you from testing the sum of every one of those combinations, and even if computing the sums cost nothing, you could not perform that many comparisons.
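
The order of magnitude quoted above can be checked with exact integer arithmetic. This is a minimal sketch using java.math.BigInteger; the names CombinationCount and choose are hypothetical, and it counts plain subsets only, ignoring the coefficients.

```java
import java.math.BigInteger;

class CombinationCount {
    /** Returns the binomial coefficient C(n, k) as an exact integer. */
    static BigInteger choose(int n, int k) {
        BigInteger result = BigInteger.ONE;
        for (int i = 1; i <= k; i++) {
            // After step i, result equals C(n - k + i, i), so the
            // division below is always exact.
            result = result.multiply(BigInteger.valueOf(n - k + i))
                           .divide(BigInteger.valueOf(i));
        }
        return result;
    }

    public static void main(String[] args) {
        BigInteger combos = choose(10_000, 60);
        // The digit count gives the order of magnitude of the search space.
        System.out.println("C(10000, 60) has " + combos.toString().length() + " decimal digits");
    }
}
```

Even at a billion comparisons per second, a count of that size is far out of reach, which is why any workable approach has to prune or sample rather than enumerate.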
