Faster alternative to nested loops?

栀梦 2020-12-12 18:35

I have a need to create a list of combinations of numbers. The numbers are quite small so I can use byte rather than int. However it requires many

12 Answers
  • 2020-12-12 19:24

    Some of your ranges fit exactly in a whole number of bits, so you can "pack" them together with the next level up:

    for (byte lm = 0; lm < 12; lm++)
    {
        ...
        t[z].l = (lm&12)>>2;
        t[z].m = lm&3;
        ...
    }
    

    Of course, this makes the code less readable, but it saves one loop. This can be done each time one of the numbers is a power of two, which is seven times in your case.
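
    As a minimal sketch of the packing idea, assuming the two merged levels have 3 and 4 values as in the snippet above (the `PackedLoops` class and `Pair` struct are hypothetical names, not part of the question's code): a single loop over `lm` stands in for the nested `l` and `m` loops, with the lower two bits holding `m` and the bits above them holding `l`.

```csharp
using System;
using System.Collections.Generic;

public static class PackedLoops
{
    // Hypothetical container for the two unpacked values.
    public struct Pair
    {
        public byte L;
        public byte M;
    }

    // Enumerates every (l, m) with l in [0, 3) and m in [0, 4) using one loop.
    // m has exactly 4 values, so it fits in bits 0-1 of lm; l uses the bits above.
    public static List<Pair> Generate()
    {
        var result = new List<Pair>(3 * 4);
        for (byte lm = 0; lm < 3 * 4; lm++)
        {
            result.Add(new Pair
            {
                L = (byte)(lm >> 2), // upper bits: 0..2
                M = (byte)(lm & 3)   // lower two bits: 0..3
            });
        }
        return result;
    }
}
```

    The pairs come out in the same order as the two nested loops would produce them, because incrementing `lm` steps `m` first and carries into `l` every fourth iteration.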

  • 2020-12-12 19:28
    var numbers = new[] { 2, 3, 4, 3, 4, 3, 3, 4, 2, 4, 4, 3, 4 };
    var result = (numbers.Select(i => Enumerable.Range(0, i))).CartesianProduct();
    

    This uses the CartesianProduct extension method from http://ericlippert.com/2010/06/28/computing-a-cartesian-product-with-linq/

    public static IEnumerable<IEnumerable<T>> CartesianProduct<T>(this IEnumerable<IEnumerable<T>> sequences)
    {
        // base case: 
        IEnumerable<IEnumerable<T>> result =
            new[] { Enumerable.Empty<T>() };
        foreach (var sequence in sequences)
        {
            // don't close over the loop variable (fixed in C# 5 BTW)
            var s = sequence;
            // recursive case: use SelectMany to build 
            // the new product out of the old one 
            result =
                from seq in result
                from item in s
                select seq.Concat(new[] { item });
        }
        return result;
    }
    
  • 2020-12-12 19:30

    You can use the properties of a struct and allocate the structure in advance. I cut off some levels in the sample below, but I'm sure you'll be able to figure out the specifics. Runs about 5-6 times faster than the original (release mode).

    The block:

    struct ByteBlock
    {
        public byte A;
        public byte B;
        public byte C;
        public byte D;
        public byte E;
    }
    

    The loop:

    var data = new ByteBlock[2*3*4*3*4];
    var counter = 0;
    
    var bytes = new ByteBlock();
    
    for (byte a = 0; a < 2; a++)
    {
        bytes.A = a;
        for (byte b = 0; b < 3; b++)
        {
            bytes.B = b;
            for (byte c = 0; c < 4; c++)
            {
                bytes.C = c;
                for (byte d = 0; d < 3; d++)
                {
                    bytes.D = d;
                    for (byte e = 0; e < 4; e++)
                    {
                        bytes.E = e;
                        data[counter++] = bytes;
                    }
                }
            }
        }
    }
    

    It's faster because it doesn't allocate a new array each time a combination is added; every ByteBlock is a value type that is copied directly into the preallocated data array. In addition, building each entry from scratch requires reading all of the loop variables (a, b, c, d, e), whereas here each field is written only when its own loop advances, which also helps data locality.

    Also read the comments for side-effects.

    Edited the answer to use a T[] instead of a List<T>.

  • 2020-12-12 19:32

    What you are doing is counting (with variable radix, but still counting).

    Since you are using C#, I assume you don't want to play with useful memory layout and data structures that let you really optimize your code.

    So here I'm posting something different, which may not suit your case but is worth noting: if you actually access the list in a sparse fashion, here is a class that lets you compute the i-th element in time linear in the number of digits (rather than generating the whole, exponentially large list as the other answers do).

    class Counter
    {
        public int[] Radices;
    
        public int[] this[int n]
        {
            get 
            { 
                int[] v = new int[Radices.Length];
                int i = Radices.Length - 1;
    
                while (n != 0 && i >= 0)
                {
                    // Hopefully the JIT emits a single div-and-remainder instruction here, as x86 provides
                    v[i] = n % Radices[i];
                    n /= Radices[i--];
                }
                return v;
            }
        }
    }
    

    You can use this class like this:

    Counter c = new Counter();
    c.Radices = new int[] { 2,3,4,3,4,3,3,4,2,4,4,3,4};
    

    Now c[i] gives the same values as l[i], where l is your list.

    As you can see, you can easily avoid all those loops :) even when you precompute the whole list, since you can simply implement a ripple-carry counter.

    Counters are a well-studied subject; I strongly advise searching for some literature if you are interested.
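
    A ripple-carry counter for precomputing the whole list could look roughly like the sketch below. It is only an illustration under stated assumptions: the `MixedRadix` class and its `Enumerate` method are hypothetical names, and every radix is assumed to be between 1 and 255 so each digit fits in a byte.

```csharp
using System;
using System.Collections.Generic;

public static class MixedRadix
{
    // Yields every digit vector for the given radices, in the same order as
    // nested loops with the last radix in the innermost loop.
    // Assumption: every radix is between 1 and 255.
    public static IEnumerable<byte[]> Enumerate(int[] radices)
    {
        var digits = new byte[radices.Length];
        while (true)
        {
            // Hand out a copy so callers can keep each combination.
            yield return (byte[])digits.Clone();

            // Increment the last digit and ripple the carry to the left.
            int i = radices.Length - 1;
            while (i >= 0 && ++digits[i] == radices[i])
            {
                digits[i] = 0;
                i--;
            }

            // The carry ran off the most significant digit: the counter wrapped.
            if (i < 0)
                yield break;
        }
    }
}
```

    Most steps touch only the last digit, so the per-element work stays small compared with decoding every index with divisions; copying each vector with `Clone` is the price of letting the caller store the results.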

  • 2020-12-12 19:35

    Method 1

    One way to make it faster is to specify the capacity if you plan to keep using List<byte[]>, like this.

    var data = new List<byte[]>(2 * 3 * 4 * 3 * 4 * 3 * 3 * 4 * 2 * 4 * 4 * 3 * 4);
    

    Method 2

    Furthermore, you could use a plain array (byte[][]) directly to gain faster access. I recommend this approach if every element really must be populated in memory up front.

    var data = new byte[2 * 3 * 4 * 3 * 4 * 3 * 3 * 4 * 2 * 4 * 4 * 3 * 4][];
    int counter = 0;
    
    for (byte a = 0; a < 2; a++)
        for (byte b = 0; b < 3; b++)
            for (byte c = 0; c < 4; c++)
                for (byte d = 0; d < 3; d++)
                    for (byte e = 0; e < 4; e++)
                        for (byte f = 0; f < 3; f++)
                            for (byte g = 0; g < 3; g++)
                                for (byte h = 0; h < 4; h++)
                                    for (byte i = 0; i < 2; i++)
                                        for (byte j = 0; j < 4; j++)
                                            for (byte k = 0; k < 4; k++)
                                                for (byte l = 0; l < 3; l++)
                                                    for (byte m = 0; m < 4; m++)
                                                        data[counter++] = new[] { a, b, c, d, e, f, g, h, i, j, k, l, m };
    

    This takes 596ms to complete on my computer, which is about 10.4% faster than the code in question (which takes 658ms).

    Method 3

    Alternatively, you can use the following technique for low-cost initialization, which suits sparse access. It is especially attractive when only some elements are needed and computing them all up front is unnecessary. Techniques like this may also become the only viable option when the number of elements is so large that memory runs short.

    In this implementation every element is computed lazily, on the fly, upon access. Naturally, this costs additional CPU time on each access.

    class HypotheticalBytes
    {
        private readonly int _c1, _c2, _c3, _c4, _c5, _c6, _c7, _c8, _c9, _c10, _c11, _c12;
        private readonly int _t0, _t1, _t2, _t3, _t4, _t5, _t6, _t7, _t8, _t9, _t10, _t11;
    
        public int Count
        {
            get { return _t0; }
        }
    
        public HypotheticalBytes(
            int c0, int c1, int c2, int c3, int c4, int c5, int c6, int c7, int c8, int c9, int c10, int c11, int c12)
        {
            _c1 = c1;
            _c2 = c2;
            _c3 = c3;
            _c4 = c4;
            _c5 = c5;
            _c6 = c6;
            _c7 = c7;
            _c8 = c8;
            _c9 = c9;
            _c10 = c10;
            _c11 = c11;
            _c12 = c12;
            _t11 = _c12 * c11;
            _t10 = _t11 * c10;
            _t9 = _t10 * c9;
            _t8 = _t9 * c8;
            _t7 = _t8 * c7;
            _t6 = _t7 * c6;
            _t5 = _t6 * c5;
            _t4 = _t5 * c4;
            _t3 = _t4 * c3;
            _t2 = _t3 * c2;
            _t1 = _t2 * c1;
            _t0 = _t1 * c0;
        }
    
        public byte[] this[int index]
        {
            get
            {
                return new[]
                {
                    (byte)(index / _t1),
                    (byte)((index / _t2) % _c1),
                    (byte)((index / _t3) % _c2),
                    (byte)((index / _t4) % _c3),
                    (byte)((index / _t5) % _c4),
                    (byte)((index / _t6) % _c5),
                    (byte)((index / _t7) % _c6),
                    (byte)((index / _t8) % _c7),
                    (byte)((index / _t9) % _c8),
                    (byte)((index / _t10) % _c9),
                    (byte)((index / _t11) % _c10),
                    (byte)((index / _c12) % _c11),
                    (byte)(index % _c12)
                };
            }
        }
    }
    

    This takes 897ms to complete on my computer (also creating the elements and adding them to an array as in Method 2), which is about 36.3% slower than the code in question (which takes 658ms).

  • 2020-12-12 19:35

    On my machine, this generates the combinations in 222 ms, versus 760 ms for the 13 nested for loops:

    private static byte[,] GenerateCombinations(byte[] maxNumberPerLevel)
    {
        var levels = maxNumberPerLevel.Length;
    
        var periodsPerLevel = new int[levels];
        var totalItems = 1;
        for (var i = 0; i < levels; i++)
        {
            periodsPerLevel[i] = totalItems;
            totalItems *= maxNumberPerLevel[i];
        }
    
        var results = new byte[totalItems, levels];
    
        Parallel.For(0, levels, level =>
        {
            var periodPerLevel = periodsPerLevel[level];
            var maxPerLevel = maxNumberPerLevel[level];
            for (var i = 0; i < totalItems; i++)
                results[i, level] = (byte)(i / periodPerLevel % maxPerLevel);
        });
    
        return results;
    }
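
    One thing to note about the method above: periodsPerLevel is accumulated starting from level 0, so level 0 varies fastest, which is the reverse of the nested loops, where the last variable varies fastest. If row order needs to match the nested loops, the periods can be accumulated from the last level instead. The sketch below shows that variant in a plain sequential form (no Parallel.For); the `CombinationTable` class and its names are hypothetical, and it was not part of the timing above.

```csharp
using System;

public static class CombinationTable
{
    // Fills a [totalItems, levels] table so that row i equals the i-th
    // combination produced by nested loops (last level innermost).
    // countPerLevel[k] is the number of values at level k (0 .. count - 1).
    public static byte[,] Generate(byte[] countPerLevel)
    {
        int levels = countPerLevel.Length;

        // period[k] = how many consecutive rows share one value at level k,
        // i.e. the product of the counts of all levels to the right of k.
        var period = new int[levels];
        int totalItems = 1;
        for (int k = levels - 1; k >= 0; k--)
        {
            period[k] = totalItems;
            totalItems *= countPerLevel[k];
        }

        var results = new byte[totalItems, levels];
        for (int k = 0; k < levels; k++)
            for (int i = 0; i < totalItems; i++)
                results[i, k] = (byte)(i / period[k] % countPerLevel[k]);

        return results;
    }
}
```

    Each column is still computed independently of the others, so the outer loop over `k` could be handed to Parallel.For in the same way as in the original method.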
    