Filling bins with an equal size

Submitted by 不羁岁月 on 2020-01-11 11:38:09

Question


I have 100 groups, each containing some number of elements. For cross-validation, I want to distribute the groups into five bins whose sizes are as equal as possible.

Is there an algorithm for this purpose?

An example for 5 groups and 2 bins:

Group_1: 5
Group_2: 6
Group_3: 2
Group_4: 7
Group_5: 1

The two bins will be:

G1 and G2 -> their sum is equal to 11.

G3, G4 and G5 -> their sum is equal to 10.


Answer 1:


This seems related to the partition problem (here in its multiway form), which is NP-hard but fortunately admits many good approximation algorithms and pseudopolynomial-time dynamic programming algorithms. You may want to look into those as a starting point, since a lot of work has already been done in this area.

Hope this helps!
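
As a rough illustration of the pseudopolynomial dynamic programming idea, here is a minimal sketch for the two-bin case only, using the group sizes from the question. The function name `two_way_partition` is made up for this example, and the sketch does not extend directly to five bins.

```python
def two_way_partition(sizes):
    """Split sizes into two bins whose sums are as close as possible.

    Subset-sum style DP: track every sum up to total // 2 that some
    subset of groups can reach, together with one subset reaching it.
    """
    half = sum(sizes) // 2
    reachable = {0: []}  # sum -> indices of one subset with that sum
    for i, size in enumerate(sizes):
        # Iterate over a snapshot so each group is used at most once.
        for s, members in list(reachable.items()):
            new_sum = s + size
            if new_sum <= half and new_sum not in reachable:
                reachable[new_sum] = members + [i]
    bin_a = reachable[max(reachable)]  # closest achievable sum to half
    bin_b = [i for i in range(len(sizes)) if i not in bin_a]
    return bin_a, bin_b


sizes = [5, 6, 2, 7, 1]  # Group_1 .. Group_5 from the question
bin_a, bin_b = two_way_partition(sizes)
print(bin_a, sum(sizes[i] for i in bin_a))
print(bin_b, sum(sizes[i] for i in bin_b))
```

The running time grows with the total number of elements rather than with the number of subsets, which is what "pseudopolynomial" refers to.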




Answer 2:


This is not a cluster analysis problem (I rewrote the question to use the more appropriate wording for you). Cluster analysis is a structure discovery task.

Instead, have a look at the following related problems from computer science:

  • Multiprocessor scheduling seems to be what you need: given n processors, distribute the tasks so that as little processor time as possible is left unused.
  • The bin packing problem is a classic NP-hard problem that solves the reverse problem: use as few bins of a fixed size as possible to accommodate all tasks.
  • The k-partition problem is probably what you want to do.

All of these appear to be NP-hard, so with large data you will want to settle for an approximation. With just 5 groups, as in your example, you can easily brute-force all combinations.
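
Both options can be sketched in a few lines. The code below is only an illustration under assumptions: it uses the common "largest first into the lightest bin" greedy heuristic from multiprocessor scheduling as the approximation (the answer does not name a specific one), and the function names `greedy_bins` and `brute_force_bins` are invented for this example.

```python
import heapq
from itertools import product


def greedy_bins(sizes, k):
    """Approximation: put each group, largest first, into the lightest bin."""
    heap = [(0, b) for b in range(k)]  # (current load, bin index)
    heapq.heapify(heap)
    bins = [[] for _ in range(k)]
    for name, size in sorted(sizes.items(), key=lambda kv: kv[1], reverse=True):
        load, b = heapq.heappop(heap)  # lightest bin so far
        bins[b].append(name)
        heapq.heappush(heap, (load + size, b))
    return bins


def brute_force_bins(sizes, k):
    """Exact: try all k**n assignments; only feasible for a handful of groups."""
    names = list(sizes)
    best, best_spread = None, None
    for assignment in product(range(k), repeat=len(names)):
        loads = [0] * k
        for name, b in zip(names, assignment):
            loads[b] += sizes[name]
        spread = max(loads) - min(loads)  # gap between fullest and emptiest bin
        if best_spread is None or spread < best_spread:
            best, best_spread = assignment, spread
    bins = [[] for _ in range(k)]
    for name, b in zip(names, best):
        bins[b].append(name)
    return bins


groups = {"Group_1": 5, "Group_2": 6, "Group_3": 2, "Group_4": 7, "Group_5": 1}
for label, result in [("greedy", greedy_bins(groups, 2)),
                      ("exact", brute_force_bins(groups, 2))]:
    print(label, [(b, sum(groups[g] for g in b)) for b in result])
```

On the five-group example from the question, both functions produce bins with totals 10 and 11. For 100 groups and five bins the brute-force search is out of reach, while the greedy pass only needs a sort.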




Answer 3:


If you're looking for a clustering algorithm (partitioning method) with an equal-size constraint, I would suggest spectral clustering. It tends to produce clusters of almost the same size because it solves the normalized cut problem, which tries to find a balanced cut.



Source: https://stackoverflow.com/questions/27338915/filling-bins-with-an-equal-size
