I'm calculating the intersection of two sets of sorted numbers in a time-critical part of my application. This calculation is the biggest bottleneck of the whole application, so I need to make it as fast as possible.
I was using Jon's approach, but I needed to execute this intersect hundreds of thousands of times as part of a bulk operation on very large sets, and I needed more performance. The case I kept running into was heavily imbalanced list sizes (e.g. 5 elements versus 80,000), and I wanted to avoid iterating the entire large list.
I found that detecting the imbalance and switching to an alternate algorithm (binary-searching the large list for each element of the small one) gave me huge benefits on those data sets:
public static IEnumerable<T> IntersectSorted<T>(this List<T> sequence1,
    List<T> sequence2,
    IComparer<T> comparer)
{
    List<T> smallList = null;
    List<T> largeList = null;

    // If one list is smaller than log2 of the other, binary-searching the
    // large list once per small element beats walking both lists.
    if (sequence1.Count < Math.Log(sequence2.Count, 2))
    {
        smallList = sequence1;
        largeList = sequence2;
    }
    else if (sequence2.Count < Math.Log(sequence1.Count, 2))
    {
        smallList = sequence2;
        largeList = sequence1;
    }

    if (smallList != null)
    {
        foreach (var item in smallList)
        {
            // BinarySearch returns a non-negative index on a hit.
            if (largeList.BinarySearch(item, comparer) >= 0)
            {
                yield return item;
            }
        }
    }
    else
    {
        // Sizes are comparable; fall back to Jon's method
        // (the linear merge walk, sketched below).
    }
}
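For the balanced case, the else branch falls through to the linear merge walk over both sorted lists that Jon's answer uses. A minimal sketch of that walk, assuming both lists are sorted ascending under the same comparer (IntersectMerge is a hypothetical helper name, not part of the original code):
private static IEnumerable<T> IntersectMerge<T>(List<T> list1,
    List<T> list2,
    IComparer<T> comparer) // requires System.Collections.Generic
{
    int i = 0, j = 0;
    while (i < list1.Count && j < list2.Count)
    {
        int cmp = comparer.Compare(list1[i], list2[j]);
        if (cmp < 0)
        {
            i++; // list1's element is smaller; advance it
        }
        else if (cmp > 0)
        {
            j++; // list2's element is smaller; advance it
        }
        else
        {
            yield return list1[i]; // match found in both lists
            i++;
            j++;
        }
    }
}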
I'm still unsure about the exact break-even point and need to do more testing. In theory, binary searching should win roughly when m * log2(n) < m + n; for the 5 vs 80,000 case above, that's about 5 * 17 = 85 comparisons instead of roughly 80,000 for the merge walk, though constant factors and cache behavior will shift the crossover in practice.
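For reference, a hypothetical call site (the list contents and sizes here are made up to mirror the imbalanced case above; requires System.Linq for Enumerable and ToList):
var large = Enumerable.Range(0, 80000).Select(x => x * 2).ToList(); // sorted by construction
var small = new List<int> { 6, 1000, 39998, 79999, 200000 };        // also sorted
var shared = small.IntersectSorted(large, Comparer<int>.Default).ToList();
// shared contains 6, 1000, 39998
Since the method detects which side is small itself, the argument order doesn't matter; with 5 elements against 80,000 it takes the binary-search path.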