C# fastest intersection of 2 sets of sorted numbers

傲寒 2020-12-28 10:00

I'm calculating the intersection of 2 sets of sorted numbers in a time-critical part of my application. This calculation is the biggest bottleneck of the whole application so I

5 answers
  •  旧巷少年郎
    2020-12-28 10:04

    I was using Jon's approach, but I needed to run this intersection hundreds of thousands of times in a bulk operation on very large sets, and I needed more performance. The case I kept running into was heavily imbalanced list sizes (e.g. 5 and 80,000), and I wanted to avoid iterating over the entire large list.

    I found that detecting the imbalance and switching to an alternate algorithm gave me huge benefits on specific data sets:

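    // Requires: using System; using System.Collections.Generic;
    // Extension method: declare it inside a non-generic static class.
    public static IEnumerable<T> IntersectSorted<T>(this List<T> sequence1,
            List<T> sequence2,
            IComparer<T> comparer)
    {
        List<T> smallList = null;
        List<T> largeList = null;

        // A binary search costs about log2(n) comparisons per lookup, so it is
        // only chosen when one list is much shorter than the other.
        if (sequence1.Count < Math.Log(sequence2.Count, 2))
        {
            smallList = sequence1;
            largeList = sequence2;
        }
        else if (sequence2.Count < Math.Log(sequence1.Count, 2))
        {
            smallList = sequence2;
            largeList = sequence1;
        }

        if (smallList != null)
        {
            // Look up each item of the small list in the large sorted list.
            foreach (var item in smallList)
            {
                if (largeList.BinarySearch(item, comparer) >= 0)
                {
                    yield return item;
                }
            }
        }
        else
        {
            // Sizes are comparable: use Jon's method
        }
    }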
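
For the balanced case, one common option is a plain linear merge over both lists. The sketch below is only an assumption about what such a fallback could look like; it is not Jon's actual code, and the names `SortedMergeSketch` and `IntersectSortedMerge` are made up for illustration. It assumes both lists are sorted according to the same comparer.

```csharp
using System;
using System.Collections.Generic;

public static class SortedMergeSketch
{
    // Hypothetical helper: walks both sorted lists in lockstep,
    // using at most Count1 + Count2 comparisons.
    public static IEnumerable<T> IntersectSortedMerge<T>(
        List<T> sequence1, List<T> sequence2, IComparer<T> comparer)
    {
        int i = 0, j = 0;
        while (i < sequence1.Count && j < sequence2.Count)
        {
            int order = comparer.Compare(sequence1[i], sequence2[j]);
            if (order < 0)
            {
                i++; // sequence1 is behind, advance it
            }
            else if (order > 0)
            {
                j++; // sequence2 is behind, advance it
            }
            else
            {
                // Equal items belong to the intersection; advance both sides.
                yield return sequence1[i];
                i++;
                j++;
            }
        }
    }
}
```

For example, intersecting `{1, 3, 5, 7}` with `{3, 4, 5, 8}` using `Comparer<int>.Default` yields 3 and 5.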
    

    I am still unsure where the break-even point between the two approaches lies; I need to do some more testing.
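
One rough way to explore the break-even point is to count comparisons instead of timing. With m items in the small list and n in the large one, the binary search route needs roughly m * log2(n) comparisons, while a linear merge needs at most m + n. The sketch below measures both with a counting comparer. Everything in it (`BreakEvenSketch`, `CountingComparer`, the two `Count...` methods) is hypothetical, and comparison counts ignore cache and branch effects, so real timings may put the threshold elsewhere.

```csharp
using System;
using System.Collections.Generic;

public static class BreakEvenSketch
{
    // Hypothetical comparer that counts how often Compare is called.
    private sealed class CountingComparer : IComparer<int>
    {
        public long Calls;

        public int Compare(int x, int y)
        {
            Calls++;
            return x.CompareTo(y);
        }
    }

    // Comparisons spent when every item of the small list is
    // binary-searched in the large sorted list.
    public static long CountBinarySearch(List<int> small, List<int> large)
    {
        var counter = new CountingComparer();
        foreach (var item in small)
        {
            large.BinarySearch(item, counter);
        }
        return counter.Calls;
    }

    // Comparisons spent by a linear merge over both sorted lists.
    public static long CountMerge(List<int> a, List<int> b)
    {
        var counter = new CountingComparer();
        int i = 0, j = 0;
        while (i < a.Count && j < b.Count)
        {
            int order = counter.Compare(a[i], b[j]);
            if (order < 0) i++;
            else if (order > 0) j++;
            else { i++; j++; }
        }
        return counter.Calls;
    }
}
```

Running both counters over lists of the sizes you actually see (for instance 5 against 80,000, then progressively larger small lists) shows where the two counts cross for your data.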
