问题
I have a List containing a high number of items - up to 10,000.
I am looking for the most efficient way to exclude these from an IQueryable/List.
Due to the complexity of the process involved in obtaining this list of Ids, it isn't possible to do this within a query.
The example below has an extremely high overhead and wondered if anybody might be able to explain possible reasons for this and if there's a better way to achieve this?
results = from q1 in results
where excludedRecords.All(x => x != q1.ItemId)
select q1;
回答1:
This is just a fragment of the code, but it looks like you have two lists - results and excludedRecords. For each element in results you iterate over all the elements in excludedRecords. This is why it is slow, it is O(N x M)
Linq and sql solve this with joining, if you join (or the equivalent) you should see some nice performance since that will me something like O(NlgM)
It would look something like this (can't test it right now)
var results2 = from q1 in results
join x in excludedRecords on q1.LeadID = x into joined
from z in joined.DefaultIfEmpty()
where z == null
select q1;
回答2:
From the shape of your query, I take excludedRecords
to be a list of integers. Further, since you tag LINQ to Entities, I take results
to be a DbSet
in a DbContext
.
This is the problem of combining local lists (excludedRecords
) with an IQueryable
that waits to be translated into SQL (results
). For EF to be able to translate the complete expression (your query) into SQL, it has to translate this local list into "something" that can be part of a SQL statement. With All()
, and many other set-based LINQ statements, and when joining the local list, EF does this by building a temp table (of sorts) from single-row tables. With only 5 elements in the local list, this looks like
SELECT ...
FROM [dbo].[Table] AS [Extent1]
WHERE EXISTS (SELECT
1 AS [C1]
FROM (SELECT
1 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable1]
UNION ALL
SELECT
2 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable2]
UNION ALL
SELECT
3 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable3]
UNION ALL
SELECT
4 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable4]
UNION ALL
SELECT
5 AS [C1]
FROM ( SELECT 1 AS X ) AS [SingleRowTable5]) AS [UnionAll4]
WHERE ([Extent1].[Id] = [UnionAll4].[C1]) OR (CASE WHEN ([Extent1].[Id] <> [UnionAll4].[C1]) THEN cast(1 as bit) WHEN ([Extent1].[Id] = [UnionAll4].[C1]) THEN cast(0 as bit) END IS NULL)
)
Although this potentially generates huge SQL statements, it's still workable when the local list doesn't contain "too many" elements (let's say, up to 1000).
The only statement that allows EF to use the local list more efficiently is Contains
. Contains
can easily be translated into a SQL IN
statement. If we rewrite your query to the equivalent with Contains
, which is also the answer to your question, ...
results = from q1 in results
where !excludedRecords.Contains(q1.ItemId)
select q1;
... the SQL query will look like
SELECT ...
FROM [dbo].[Table] AS [Extent1]
WHERE NOT ([Extent1].[Id] IN (1, 2, 3, 4, 5))
The IN
statement can handle more elements than this "temp table", although this number is still limited (maybe 3000).
来源:https://stackoverflow.com/questions/20573342/excluding-large-listint-from-linq-to-entities-query