C#: Code to fit LOTS of files onto a DVD as efficiently as possible

二次信任 submitted on 2019-12-05 05:28:00

Simple algorithm:

  1. Sort the file list by file size
  2. Find the largest file that is no larger than the remaining free space on the DVD, and add it to the DVD.
  3. If the remaining free space on the DVD is smaller than every remaining file, start a new DVD.
  4. Repeat from 2.
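The steps above can be sketched roughly as follows. This is a minimal Python sketch rather than C#, and it assumes file sizes and the disc capacity are plain numbers in the same unit; the function name `pack_largest_that_fits` is made up for illustration.

```python
from bisect import bisect_right


def pack_largest_that_fits(sizes, capacity):
    """Fill one disc at a time, always adding the largest file that still fits."""
    remaining = sorted(sizes)  # ascending, so bisect can search it
    if remaining and remaining[-1] > capacity:
        raise ValueError("a file is larger than one disc")
    discs = []
    while remaining:
        free, disc = capacity, []
        while True:
            # remaining[:i] are exactly the files that still fit in `free`
            i = bisect_right(remaining, free)
            if i == 0:
                break  # nothing left fits: start a new disc
            size = remaining.pop(i - 1)  # largest file that fits
            disc.append(size)
            free -= size
        discs.append(disc)
    return discs
```

For example, `pack_largest_that_fits([1, 1, 2, 2, 3, 4, 5], 6)` returns `[[5, 1], [4, 2], [3, 2, 1]]`.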

What you're facing is related to the knapsack problem. The Wikipedia page on it has a lot more information, including suggested ways of solving it.

For anyone still interested in this question... I wrote a utility which I used for a similar purpose of fitting files into a set of disks/discs. It uses a command-line/file-based interface. Versions are available in C, C++, & Java (not C#).

http://whizman.com/code/diskfit.tgz

More detailed information is in the diskfit.tgz:Doc/diskfit.txt file.

(AGPL3)

We might characterize the question as 0-1 multiple-knapsack, or linear bin packing. (Thanks jon-skeet for the link about knapsack problem.)

Dthorpe's answer solves linear bin packing, using exactly as many bins/disks as are needed to fit all files [nicely fast at O(n) or O(n lg n); it may also be feasible in a spreadsheet without having to write a script].

Basically, diskfit (above-linked utility) outputs qualifying file-sets based around 0-1 single-knapsack, and the user chooses one-disk file-sets to assemble into the disk-set - assisting the user (but not fully automating) toward both:

  • linear bin packing - for the complete disk set;
  • 0-1 multiple-knapsack - for each subset of disks 1..k of the full disk set (where files are prioritized, aka differ in value).

Fully programmatic choice of the complete disk-set would be an additional feature. It would be insufficient to apply a 0-1 single-knapsack solution automatically, disc by disc [greedily]. (Consider 3 knapsacks of capacity 6, and available items with equal value and weights of: {1, 1, 2, 2, 3, 4, 5}. Applying 0-1 knapsack to the first knapsack in isolation would choose {1, 1, 2, 2} to obtain sum value 4, after which we cannot fit all of the remaining 3 items {3, 4, 5} in the second & third knapsacks, since no two of them fit together within capacity 6. Yet we know we can fit all items in the 3 knapsacks as {1, 2, 3} & {1, 5} & {2, 4}.)
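The counterexample can be checked with a few lines of Python. This is only an illustrative sketch, not diskfit's code: the function names are invented, and because all values are equal, the single-knapsack step is reduced to "take the lightest items first", which maximises the item count.

```python
def max_count_knapsack(weights, capacity):
    """0-1 knapsack when every item has value 1: lightest-first maximises the count."""
    chosen, used = [], 0
    for w in sorted(weights):
        if used + w > capacity:
            break
        chosen.append(w)
        used += w
    return chosen


def fill_disc_by_disc(weights, capacity, discs):
    """Solve each knapsack in isolation, in order, and report what is left over."""
    left = list(weights)
    packed = []
    for _ in range(discs):
        chosen = max_count_knapsack(left, capacity)
        for w in chosen:
            left.remove(w)
        packed.append(chosen)
    return packed, left


packed, left = fill_disc_by_disc([1, 1, 2, 2, 3, 4, 5], 6, 3)
# packed is [[1, 1, 2, 2], [3], [4]] and left is [5]: one item does not fit.
```

The hand-made packing {1, 2, 3} & {1, 5} & {2, 4} uses the same three knapsacks and leaves nothing over.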

for each file
 is there enough room on this dvd?
   yes, store it here
   no, is there room on another already allocated dvd?
     yes, store it there
     no, allocate another dvd and store it there
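A runnable version of that pseudocode might look like this in Python. It is a sketch under a few assumptions: sizes and capacity are numbers in the same unit, "this dvd" means the most recently allocated one, and the name `first_fit` is my own.

```python
def first_fit(sizes, capacity):
    """Place each file on the current DVD if it fits, else on any earlier DVD, else on a new one."""
    dvds = []  # each entry is [free_space, [file sizes on that DVD]]
    for size in sizes:
        if size > capacity:
            raise ValueError("a file is larger than one DVD")
        # Try the current (most recently allocated) DVD first, then the earlier ones.
        candidates = dvds[-1:] + dvds[:-1]
        for dvd in candidates:
            if dvd[0] >= size:
                dvd[0] -= size
                dvd[1].append(size)
                break
        else:
            # No allocated DVD has room: allocate another and store the file there.
            dvds.append([capacity - size, [size]])
    return [files for _, files in dvds]
```

For example, `first_fit([4, 3, 2, 3], 6)` returns `[[4], [3, 2], [3]]`. The files are taken in whatever order they arrive, so the result depends on that order.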

While that's a cool problem to solve in a program for certain applications, in your application why not just use WinRAR or some other archiving program that can split the archive into chunks of a specific size? You could make each chunk the size of a DVD and then just burn away.

EDIT: one issue you would run into is that if one of your files is greater than the size of your media, you are not going to be able to burn that file.

How about starting by putting as many of the largest files as you can onto one DVD, and then filling it up with as many of the smallest files as you can (starting with the smallest)?

Repeat this process with the remaining files for each disk.

I'm not sure that's going to give you perfect coverage/distribution but I think it might go some way to solving your needs.
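Here is one way to read that idea as Python. It is a sketch, assuming "as many of the largest as you can" means taking files from the large end until the next one no longer fits, and then doing the same from the small end; the function name is invented.

```python
from collections import deque


def pack_large_then_small(sizes, capacity):
    """Per disc: take the largest files while they fit, then top up with the smallest."""
    files = deque(sorted(sizes))  # smallest on the left, largest on the right
    if files and files[-1] > capacity:
        raise ValueError("a file is larger than one disc")
    discs = []
    while files:
        free, disc = capacity, []
        while files and files[-1] <= free:  # largest files first
            free -= files[-1]
            disc.append(files.pop())
        while files and files[0] <= free:   # then fill up with the smallest
            free -= files[0]
            disc.append(files.popleft())
        discs.append(disc)
    return discs
```

On sizes `[1, 1, 2, 2, 3, 4, 5]` with capacity 6 this gives `[[5, 1], [4, 1], [3, 2], [2]]`, i.e. four discs where three would be enough, so the coverage is indeed not perfect.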

adi serbane

Use backtracking to get the optimal set of files to burn to DVD 1, then exclude them from the list and use backtracking on the remaining files to get the optimal fill for DVD 2, and so on.
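A minimal Python sketch of that approach, assuming positive numeric sizes and taking "optimal" to mean the fullest possible disc. The names `best_fill` and `pack_by_backtracking` are made up, and the search is exponential in the worst case.

```python
def best_fill(sizes, capacity):
    """Backtracking search for the subset with the largest total that is <= capacity."""
    sizes = sorted(sizes, reverse=True)
    best = (0, [])

    def search(i, total, chosen):
        nonlocal best
        if total > best[0]:
            best = (total, chosen[:])
        if i == len(sizes) or best[0] == capacity:
            return  # no files left, or the disc is already completely full
        if total + sizes[i] <= capacity:
            chosen.append(sizes[i])          # branch 1: take file i
            search(i + 1, total + sizes[i], chosen)
            chosen.pop()                     # undo the choice (backtrack)
        search(i + 1, total, chosen)         # branch 2: skip file i

    search(0, 0, [])
    return best[1]


def pack_by_backtracking(sizes, capacity):
    """Fill DVD 1 optimally, remove those files, repeat for DVD 2, and so on."""
    left = list(sizes)
    if any(s <= 0 or s > capacity for s in left):
        raise ValueError("sizes must be positive and fit on one disc")
    discs = []
    while left:
        disc = best_fill(left, capacity)
        for s in disc:
            left.remove(s)
        discs.append(disc)
    return discs
```

For example, `pack_by_backtracking([1, 1, 2, 2, 3, 4, 5], 6)` returns `[[5, 1], [4, 2], [3, 2, 1]]`.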

I've found a lot of tools that are supposed to solve this problem, but they all try to minimize the TOTAL number of discs used, while I was just interested in the SINGLE subset of files that best fits a SINGLE disc.

So I ended up writing my own tool called "ss" (named after the "subset sum" algorithm it is based on). The tool is still buggy and can't recurse into directories, but it's working for me. :)
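For the single-disc case, a subset-sum search can be sketched with dynamic programming. This is not the code of the "ss" tool, just an illustration in Python. It assumes integer sizes in a coarse unit (say MB), because the table can grow to one entry per reachable total up to the capacity; the name `best_subset` is hypothetical.

```python
def best_subset(sizes, capacity):
    """Subset sum by dynamic programming: reachable[t] holds one subset that sums to t."""
    reachable = {0: []}
    for size in sizes:
        # Iterate over a snapshot so each file is used at most once.
        for total, subset in list(reachable.items()):
            new_total = total + size
            if new_total <= capacity and new_total not in reachable:
                reachable[new_total] = subset + [size]
    # The largest reachable total is the best fill for one disc.
    return reachable[max(reachable)]
```

For example, `best_subset([7, 5, 4, 3], 10)` returns a subset that sums to exactly 10, and `best_subset([8, 9], 10)` returns `[9]`.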

This problem is the bin packing problem, which is NP-hard (its decision version is NP-complete). No polynomial-time algorithm for a truly optimal solution is known, so exact methods take exponential time in the worst case. However, there are methods that give less-than-optimal solutions but run much faster.

Assume we have an unbounded supply of disks. Take the files in descending order of size and add each one to the first disk it fits on. This is called first fit decreasing (FFD) and needs at most 11/9 OPT + 6/9 disks in the worst case, where OPT is the optimal number of disks. If you take the files in arbitrary order instead (plain first fit), the worst-case bound weakens to about 17/10 OPT disks.
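First fit decreasing is short enough to sketch directly. The Python below assumes numeric sizes in the same unit as the capacity. The helper `lower_bound` is an extra of mine: it gives the trivial bound (total size divided by capacity, rounded up), which is handy for judging how close a packing is to optimal.

```python
import math


def first_fit_decreasing(sizes, capacity):
    """Sort files by descending size, then put each on the first disk with room."""
    free = []   # free[i] is the space left on disk i
    disks = []  # disks[i] lists the file sizes placed on disk i
    for size in sorted(sizes, reverse=True):
        if size > capacity:
            raise ValueError("a file is larger than one disk")
        for i, space in enumerate(free):
            if size <= space:
                free[i] -= size
                disks[i].append(size)
                break
        else:
            # No existing disk has room, so open a new one.
            free.append(capacity - size)
            disks.append([size])
    return disks


def lower_bound(sizes, capacity):
    """No packing can use fewer disks than total size / capacity, rounded up."""
    return math.ceil(sum(sizes) / capacity)
```

For sizes `[1, 1, 2, 2, 3, 4, 5]` and capacity 6, `first_fit_decreasing` returns `[[5, 1], [4, 2], [3, 2, 1]]`, which matches the lower bound of 3 disks.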

There are algorithms that will pack things tighter; see the Wikipedia link above for more details.
