How can I obtain all the possible combinations of a subset?

坚强是说给别人听的谎言 posted on 2019-12-17 14:55:02

Question


Consider this List<string>:

List<string> data = new List<string>();
data.Add("Text1");
data.Add("Text2");
data.Add("Text3");
data.Add("Text4");

The problem I had was: how can I get every combination of the list's elements, for each subset size? Something like this:

#Subset Dimension 4
Text1;Text2;Text3;Text4

#Subset Dimension 3
Text1;Text2;Text3;
Text1;Text2;Text4;
Text1;Text3;Text4;
Text2;Text3;Text4;

#Subset Dimension 2
Text1;Text2;
Text1;Text3;
Text1;Text4;
Text2;Text3;
Text2;Text4;
Text3;Text4;

#Subset Dimension 1
Text1;
Text2;
Text3;
Text4;

I came up with a decent solution which I think is worth sharing here.


Answer 1:


I think the answers to this question need some performance tests. I'll give it a go. This is community wiki, so feel free to update it.

void PerfTest()
{
    var list = Enumerable.Range(0, 21).ToList();

    var t1 = GetDurationInMs(list.SubSets_LB);
    var t2 = GetDurationInMs(list.SubSets_Jodrell2);
    var t3 = GetDurationInMs(() => list.CalcCombinations(20));

    Console.WriteLine("{0}\n{1}\n{2}", t1, t2, t3);
}

long GetDurationInMs(Func<IEnumerable<IEnumerable<int>>> fxn)
{
    fxn(); // warm-up call; the iterators are lazy, so this alone does not run their bodies
    var count = 0;

    var sw = Stopwatch.StartNew();
    foreach (var ss in fxn())
    {
        count = ss.Sum();
    }
    return sw.ElapsedMilliseconds;
}

OUTPUT:

1281
1604 (_Jodrell not _Jodrell2)
6817

Jodrell's Update

I've built in release mode, i.e. with optimizations on. When I run via Visual Studio I don't get a consistent bias between 1 and 2, but after repeated runs LB's answer wins. I get results approaching something like:

1190
1260
more

But if I run the test harness from the command line, not via Visual Studio, I get results more like this:

987
879
still more



Answer 2:


Similar logic to Abaco's answer, different implementation.

foreach (var ss in data.SubSets_LB())
{
    Console.WriteLine(String.Join("; ",ss));
}

public static class SO_EXTENSIONS
{
    public static IEnumerable<IEnumerable<T>> SubSets_LB<T>(
      this IEnumerable<T> enumerable)
    {
        List<T> list = enumerable.ToList();
        ulong upper = (ulong)1 << list.Count;

        for (ulong i = 0; i < upper; i++)
        {
            List<T> l = new List<T>(list.Count);
            for (int j = 0; j < sizeof(ulong) * 8; j++)
            {
                if (((ulong)1 << j) >= upper) break;

                if (((i >> j) & 1) == 1)
                {
                    l.Add(list[j]);
                }
            }

            yield return l;
        }
    }
}
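
To get the layout from the question (subsets listed per dimension), the bit-mask enumeration can be combined with a grouping step. The following is only a minimal sketch under stated assumptions: `SubsetGrouping`, `NonEmptySubsets` and `BySizeDescending` are hypothetical names that do not appear in any of the answers, the empty subset is skipped, and the input is assumed to hold at most 63 items.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class SubsetGrouping
{
    // Hypothetical helper: enumerates every non-empty subset via bit masks.
    // Bit j of the mask decides whether items[j] is part of the subset.
    public static IEnumerable<List<T>> NonEmptySubsets<T>(IReadOnlyList<T> items)
    {
        if (items.Count > 63)
            throw new ArgumentException("At most 63 items are supported.");

        ulong upper = 1UL << items.Count;
        for (ulong mask = 1; mask < upper; mask++)
        {
            var subset = new List<T>();
            for (int j = 0; j < items.Count; j++)
            {
                if (((mask >> j) & 1) == 1) subset.Add(items[j]);
            }
            yield return subset;
        }
    }

    // Groups the subsets by their size, largest size first.
    public static IEnumerable<IGrouping<int, List<T>>> BySizeDescending<T>(
        IReadOnlyList<T> items)
    {
        return NonEmptySubsets(items)
            .GroupBy(s => s.Count)
            .OrderByDescending(g => g.Key);
    }
}
```

Each group's `Key` is the subset dimension, so printing a "#Subset Dimension" header per group followed by `string.Join(";", subset)` for each member gives output close to the one in the question. Note that `GroupBy` buffers all subsets before yielding the first group, so this trades the streaming behaviour of the answers for the grouped layout.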



Answer 3:


EDIT

I've accepted the performance gauntlet. What follows is my amalgamation that takes the best of all the answers. In my testing, it seems to have the best performance yet.

public static IEnumerable<IEnumerable<T>> SubSets_Jodrell2<T>(
    this IEnumerable<T> source)
{
    var list = source.ToList();
    var limit = 1UL << list.Count;

    for (var i = limit; i > 0; i--)
    {
        yield return list.SubSet(i);
    }
}

private static IEnumerable<T> SubSet<T>(
    this IList<T> source, ulong bits)
{
    for (var i = 0; i < source.Count; i++)
    {
        if (((bits >> i) & 1) == 1)
        {
            yield return source[i];
        }
    }
}

Same idea again, almost the same as L.B's answer but my own interpretation.

I avoid the use of an internal List and Math.Pow.

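public static IEnumerable<IEnumerable<T>> SubSets_Jodrell<T>(
    this IEnumerable<T> source)
{
    var count = source.Count();

    // A ulong mask has 64 bits, and 1UL << 64 wraps around to 1,
    // so at most 63 elements can be handled.
    if (count > 63)
    {
        throw new OverflowException("Not Supported ...");
    }

    // Masks run from 2^count - 2 down to 1, which skips both the
    // full set and the empty set.
    var limit = (1UL << count) - 2;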

    for (var i = limit; i > 0; i--)
    {
        yield return source.SubSet(i);
    }
}

private static IEnumerable<T> SubSet<T>(
    this IEnumerable<T> source,
    ulong bits)
{
    var check = (ulong)1;
    foreach (var t in source)
    {
        if ((bits & check) > 0)
        {
            yield return t;
        }

        check <<= 1;
    }
}

You'll note that these methods don't work with 64 or more elements in the initial set (the mask is a 64-bit ulong, and 1UL << 64 wraps around to 1), but enumeration takes a very long time well before that point anyhow.
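
A detail worth keeping in mind with these bit-mask approaches: C# reduces the shift count to the width of the left operand, so `1 << n` (an int shift) and `1UL << n` (a ulong shift) behave differently once n reaches 32, and even the ulong shift wraps at n = 64. The sketch below only illustrates that language rule; `ShiftDemo` and its methods are hypothetical names, not part of any answer.

```csharp
using System;

public static class ShiftDemo
{
    // int shift: only the low 5 bits of n are used, so n = 33 acts like n = 1.
    public static int IntShift(int n)
    {
        return 1 << n;
    }

    // ulong shift: only the low 6 bits of n are used, so n = 64 acts like n = 0.
    public static ulong LongShift(int n)
    {
        return 1UL << n;
    }
}
```

With these definitions, `ShiftDemo.IntShift(33)` returns 2, `ShiftDemo.LongShift(33)` returns 8589934592, and `ShiftDemo.LongShift(64)` returns 1. This is why the upper bound of the mask loop is best computed from a `1UL` literal and why 63 is the largest element count a single ulong mask can serve.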




Answer 4:


I developed a simple extension method for lists:

    /// <summary>
    /// Obtain all the combinations of the elements contained in a list
    /// </summary>
    /// <param name="subsetDimension">Subset Dimension</param>
    /// <returns>IEnumerable containing all the different subsets</returns>
    public static IEnumerable<List<T>> CalcCombinations<T>(this List<T> list, int subsetDimension)
    {
        //First we build a binary matrix. Each row needs one entry (0 or 1)
        //for every element of the list, so the row length is list.Count.
        //The rows are the binary representations of the numbers
        //below 2^list.Count.
        int rowDimension = Convert.ToInt32(Math.Pow(2, list.Count));

        //Now we start counting! We build one row for every number from 1
        //(0 is the empty subset, so it is skipped) up to rowDimension - 1.
        //Each row is a binary mask, hence the name.
        List<int[]> combinationMasks = new List<int[]>();
        for (int i = 1; i < rowDimension; i++)
        {
            //Grab the binary representation of the number
            string binaryString = Convert.ToString(i, 2);

            //Initialize an array of the appropriate dimension
            int[] mask = new int[list.Count];

            //Convert the string into an array of 0s and 1s, then copy it into
            //the mask (which has the appropriate dimension). Reverse() puts the
            //least significant bit at index 0, so the shorter array lines up
            //correctly when CopyTo() writes it starting at index 0.
            binaryString.Select(x => x == '0' ? 0 : 1).Reverse().ToArray().CopyTo(mask, 0);

            //Only masks with exactly subsetDimension ones describe a subset
            //of the requested size, so all other masks are filtered out.
            if (mask.Sum() == subsetDimension) combinationMasks.Add(mask);
        }

        //And now we apply the matrix to our list
        foreach (int[] mask in combinationMasks)
        {
            List<T> temporaryList = new List<T>(list);

            //Iterate in reverse order so that removing an item does not
            //shift the indices still to be visited
            for (int iter = mask.Length - 1; iter >= 0; iter--)
            {
                //Whenever a 0 is found the corresponding item is removed from the list
                if (mask[iter] == 0)
                    temporaryList.RemoveAt(iter);
            }
            yield return temporaryList;
        }
    }
}

So, considering the example in the question with a subset dimension of 3:

# Row Dimension of 4 (list.Count)
Binary Numbers to 2^4

# Binary Matrix
0 0 0 1 => skip
0 0 1 0 => skip
[...]
0 1 1 1 => added // Text2;Text3;Text4
[...]
1 0 1 1 => added // Text1;Text3;Text4
1 1 0 0 => skip
1 1 0 1 => added // Text1;Text2;Text4
1 1 1 0 => added // Text1;Text2;Text3
1 1 1 1 => skip
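
The filtering shown in the matrix can also be done on the integer masks directly, without converting each number to a string. What follows is only a sketch of that variation, not the method from this answer: `MaskFilterSketch`, `CountBits` and `CombinationsOfSize` are hypothetical names, bit j of a mask is assumed to correspond to `list[j]`, and the list is assumed to hold at most 31 elements.

```csharp
using System;
using System.Collections.Generic;

public static class MaskFilterSketch
{
    // Counts the 1 bits in a mask by repeatedly clearing the lowest set bit.
    private static int CountBits(uint mask)
    {
        int count = 0;
        while (mask != 0)
        {
            mask &= mask - 1;
            count++;
        }
        return count;
    }

    // Yields every subset with exactly subsetDimension elements.
    public static IEnumerable<List<T>> CombinationsOfSize<T>(
        IReadOnlyList<T> list, int subsetDimension)
    {
        if (list.Count > 31)
            throw new ArgumentException("At most 31 elements are supported.");

        uint rows = 1u << list.Count;
        for (uint mask = 1; mask < rows; mask++)
        {
            // Skip masks that do not describe a subset of the requested size.
            if (CountBits(mask) != subsetDimension) continue;

            var subset = new List<T>(subsetDimension);
            for (int j = 0; j < list.Count; j++)
            {
                if ((mask & (1u << j)) != 0) subset.Add(list[j]);
            }
            yield return subset;
        }
    }
}
```

For the four-element list from the question and a subset dimension of 3, this yields Text1;Text2;Text3, then Text1;Text2;Text4, then Text1;Text3;Text4, then Text2;Text3;Text4. It still visits all 2^n masks, so the running time grows the same way as in the original method; it mainly avoids the string conversion and the intermediate list of masks.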

Hope this can help someone :)

If you need clarification or want to contribute, feel free to add answers or comments (whichever is more appropriate).



Source: https://stackoverflow.com/questions/13765699/how-can-i-obtain-all-the-possible-combination-of-a-subset
