Excluding large List<int> from LINQ to Entities query

佐手、 submitted on 2019-12-14 04:19:54

Question


I have a List containing a high number of items - up to 10,000.

I am looking for the most efficient way to exclude these from an IQueryable/List.

Due to the complexity of the process involved in obtaining this list of Ids, it isn't possible to do this within a query.

The example below has an extremely high overhead. Can anybody explain the possible reasons for this, and is there a better way to achieve it?

results = from q1 in results 
  where excludedRecords.All(x => x != q1.ItemId)
  select q1;

Answer 1:


This is just a fragment of the code, but it looks like you have two lists: results and excludedRecords. For each element in results you iterate over all the elements in excludedRecords. This is why it is slow: it is O(N x M).

LINQ and SQL solve this with joining. If you join (or do the equivalent) you should see much better performance, since that will be something like O(N log M).

It would look something like this (can't test it right now):

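// Group join acting as an anti-join: keep only rows with no match in excludedRecords.
// A LINQ join needs the "equals" keyword, and ItemId is the key used in the question.
var results2 = from q1 in results
               join x in excludedRecords on q1.ItemId equals x into joined
               where !joined.Any()
               select q1;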
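
If results has already been materialized into memory, the same idea can be sketched with a HashSet<int>, which replaces the inner scan over excludedRecords with a hash lookup. This is only an illustration under that assumption; the Item class and the InMemoryExclusion helper are hypothetical names, and only ItemId comes from the question.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape; only the ItemId property appears in the question.
public sealed class Item
{
    public int ItemId { get; set; }
}

public static class InMemoryExclusion
{
    // Builds the set once (O(M)), then performs one hash lookup per element
    // of results (O(N) on average), instead of scanning the list each time.
    public static List<Item> Exclude(IEnumerable<Item> results, IEnumerable<int> excludedRecords)
    {
        var excluded = new HashSet<int>(excludedRecords);
        return results.Where(r => !excluded.Contains(r.ItemId)).ToList();
    }
}
```

This only helps for LINQ to Objects. Against a database-backed IQueryable, the filter would first have to pull the rows into memory.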



Answer 2:


From the shape of your query, I take excludedRecords to be a list of integers. Further, since you tag LINQ to Entities, I take results to be a DbSet in a DbContext.

This is the problem of combining local lists (excludedRecords) with an IQueryable that waits to be translated into SQL (results). For EF to be able to translate the complete expression (your query) into SQL, it has to translate this local list into "something" that can be part of a SQL statement. With All(), and many other set-based LINQ statements, and when joining the local list, EF does this by building a temp table (of sorts) from single-row tables. With only 5 elements in the local list, this looks like

SELECT ...
    FROM [dbo].[Table] AS [Extent1]
    WHERE  NOT EXISTS (SELECT 
        1 AS [C1]
        FROM  (SELECT 
            1 AS [C1]
            FROM  ( SELECT 1 AS X ) AS [SingleRowTable1]
        UNION ALL
            SELECT 
            2 AS [C1]
            FROM  ( SELECT 1 AS X ) AS [SingleRowTable2]
        UNION ALL
            SELECT 
            3 AS [C1]
            FROM  ( SELECT 1 AS X ) AS [SingleRowTable3]
        UNION ALL
            SELECT 
            4 AS [C1]
            FROM  ( SELECT 1 AS X ) AS [SingleRowTable4]
        UNION ALL
            SELECT 
            5 AS [C1]
            FROM  ( SELECT 1 AS X ) AS [SingleRowTable5]) AS [UnionAll4]
        WHERE ([Extent1].[Id] = [UnionAll4].[C1]) OR (CASE WHEN ([Extent1].[Id] <> [UnionAll4].[C1]) THEN cast(1 as bit) WHEN ([Extent1].[Id] = [UnionAll4].[C1]) THEN cast(0 as bit) END IS NULL)
    )

Although this potentially generates huge SQL statements, it's still workable when the local list doesn't contain "too many" elements (let's say, up to 1000).

The only method that allows EF to use the local list more efficiently is Contains, because Contains can easily be translated into a SQL IN clause. If we rewrite your query to the equivalent with Contains, which is also the answer to your question, ...

results = from q1 in results 
          where !excludedRecords.Contains(q1.ItemId)
          select q1;

... the SQL query will look like

SELECT ...
    FROM [dbo].[Table] AS [Extent1]
    WHERE  NOT ([Extent1].[Id] IN (1, 2, 3, 4, 5))

The IN clause can handle more elements than this "temp table", although the number is still limited (perhaps around 3000).
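
To make the equivalence of the two query shapes concrete, here is a minimal sketch in method syntax. It assumes nothing about the real model: Row and ExclusionQueries are hypothetical names, and an in-memory AsQueryable() source can stand in for the DbSet, so it shows that both filters select the same rows but says nothing about the SQL that EF generates.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the entity; only ItemId comes from the question.
public sealed class Row
{
    public int ItemId { get; set; }
}

public static class ExclusionQueries
{
    // Original shape: All(...) over the local list.
    public static IQueryable<Row> WithAll(IQueryable<Row> results, List<int> excludedRecords) =>
        results.Where(q1 => excludedRecords.All(x => x != q1.ItemId));

    // Rewritten shape: negated Contains over the local list.
    public static IQueryable<Row> WithContains(IQueryable<Row> results, List<int> excludedRecords) =>
        results.Where(q1 => !excludedRecords.Contains(q1.ItemId));
}
```

Calling both methods with the same source and the same excludedRecords should return the same rows; only the translation on the EF side differs.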



Source: https://stackoverflow.com/questions/20573342/excluding-large-listint-from-linq-to-entities-query
