Which is the best method between Genetic Algorithm and Dynamic Programming to solve classic 0-1 knapsack? [closed]

问题

Let's say I have problem like this:

Knapsack capacity = 20 million
Number of item = 500
Weight of each item is random number between 100 to 20 million
Profit of each item is random number between 1 to 10

So which is the best method for my problem? GA or Dynamic Programming?

Please give me a simple explanation, as I'm newbie in this..

回答1:

Dynamic Programming (DP):

Exact algorithm - Finds global optimal solution
Long running time
Uses a lot of memory
Very simple to implement

Genetic Algorithm (GA):

Estimation - Doesn't necessarily find the global optimal solution
Short running time
Memory usage depends on number of individuals but is generally managable
Quality of solution depends on choosing an efficient representation + letting it run long enough
Reasonably simple to implement, design decisions may be a little more complex, especially if you don't have significant experience with GAs

Hill climbing:

Estimation - Doesn't necessarily find the global optimal solution. More subject to halting at a local optimum than a GA, though there are ways to reduce the chance of this happening
Short running time
Very low memory usage
Very simple to implement

DP (or any exact algorithm for an NP-complete problem) is generally only a good idea for a reasonably small problem, or if finding the global optimal is the most important thing.

There are 2 approaches to DP: (there is a reasonably simple optimization where you only store 2 rows, my memory usage analysis goes on the assumption that this optimization is used)

Have a matrix of items x weight, with cell value being the maximum value

Matrix size = 500 x 20 000 000

Running time = O(500 * 20 000 000) = O(10 000 000 000)

Memory = max 10 * 500 -> 5 000 -> short = 2 bytes -> 2 * 20 000 000 * 2 = 80 000 000 < 80 MB

Explanation: A[i,j] below represents the best (highest) value obtainable from any subset of the elements 1 to i with weight less than or equal to j. The update rule below means - find the best value between either not including the current element (thus the weight and value stays the same) or including it (so lookup the optimal value for (the current weight minus the current item's weight) and add the value of the current item to it). Then just return the A[500, 20000000], which represents the highest value obtainable from any subset of all elements with a maximum weight of the knapsack size.

Algorithm:
```
A[0, 0..20000000] = 0
for i = 1:500
for x = 0:20000000
  A[i, x] = max(A[i-1, x], value(i) + A[i-1, x-weight(i)])
// ignore value(i) + A[i-1, x-weight(i)] when weight(i) > x
return A[500, 20000000]
```
Have a matrix of items x value, with cell value being the minimum weight

Matrix size = 500 x 10*500

Running time = O(500 * 10*500) = O(2 500 000)

Memory = max 20 000 000 -> int = 4 bytes -> 2 * 500 * 4 = 4 000 < 4 KB

Explanation: A[i,j] below represents the lowest weight obtainable from any subset of the elements 1 to i with value equal to j. The update rule below means - find the best value between either not including the current element (thus the weight and value stays the same) or including it (so lookup the optimal value for (the current value minus the current item's value) and add the weight of the current item to it). The value at any cell is the exact weight of a subset to produce that value, so we need to look through all the cells A[500, x], which represents minimum weight subsets of elements for any value x.

Algorithm:
```
A[0, 0] = 0
A[0, 1..5000] = ∞
for i = 1:500
for x = 0:5000
  A[i, x] = min(A[i-1, x], weight(i) + A[i-1, x-value(i)])
// interpret A[i-1, x-value(i)] as 0 when value(i) > x
return largest x that A[500, x] <= 20000000
```

So, yeah, the complexities pretty much speak for themselves, you'll wait a few hours for the first way, but mere seconds for the second, and there's a similar difference in memory usage (though 80 MB is still near negligible) (note that this is FAR from a rule, every case needs to be analysed in its own right).

回答2:

Dynamic programming can run in time O(numItems * knapsackCapacity) and O(knapsackCapacity) memory. This means that, for your specifications, you would have:

about 20.000.000 x 500 = 10.000.000.000 operations - would probably finish executing in a few hours, depending on your machine;
since the profit of one item is at most 10 and you can have at most 500 items, that means your maximum theoretical profit cannot exceed 10 x 500 = 5000. This can be represented on 2 bytes (a short). So you will also need 2 x 20.000.000 / 1024 / 1024 ~= 38 MB of memory to store your DP array. Again, not that much.

The others have already mentioned the advantages and disadvantages of each. GA will finish faster, but the DP solution should also finish in a few hours, and it will give you an exact answer.

Personally, I would choose DP if waiting a few hours for the solution is not a problem.

Note: here is the DP that runs with O(knapsackCapacity) memory:

dp[i] = maximum profit obtainable for weight i
for i = 1 to numItems do
  for j = knapsackCapacity down to items[i].weight do
    dp[j] = max(dp[j], dp[j - items[i].weight] + items[i].profit)

回答3:

There is no "best" method, each has it's pros and cons.
In this case the trade off is between optimality of the solution found - GA does not in any way guarantee to find the best solution.
And the time/space required to compute the solution - using dynamic programming will save you some redundant computations but you will still have to compute all possible combinations at least once to find the best solution (or maybe even any solution at all).

回答4:

First you need to consider Dynamic programming as an exact algorithm which can guarantee that the answer is going to be an optimum answer. On the other hand GA is a heuristic algorithm which usually converge to a local optima. There is a dynamic programming solution for the knapsack which is pseudo linear (O(capacity*number_of_items)) if they are integers. In your case, you need about 1e10 operation and memory. Which are feasible for a single computer. So if finding the optimum answer is important for you and you have enough time (in the order of a couple of hours) you can use a dynamic programming approach. Otherwise, you may prefer to use a heuristic algorithm, such as GA.

来源：https://stackoverflow.com/questions/14848874/which-is-the-best-method-between-genetic-algorithm-and-dynamic-programming-to-so

标签

algorithm

dynamic

genetic

knapsack-problem