Entity Framework Include OrderBy random generates duplicate data

[愿得一人] 2020-11-30 02:29

When I retrieve a list of items from a database including some children (via .Include), and order them randomly, EF gives me an unexpected result. It creates/clones addition

6 answers
  •  挽巷 (original poster)
    2020-11-30 02:31

    tl;dr: There's a leaky abstraction here. To us, Include is a simple instruction to stick a collection of things onto each single returned Person row. But EF's implementation of Include is done by returning a whole row for each Person-Address combo, and reassembling at the client. Ordering by a volatile value causes those rows to become shuffled, breaking apart the Person groups that EF is relying on.


    When we look at the ToTraceString() output for this LINQ query:

    var people = c.People.Include("Addresses");
    // Note: no OrderBy in sight!
    

    we see

    SELECT 
    [Project1].[Id] AS [Id], 
    [Project1].[Name] AS [Name], 
    [Project1].[C1] AS [C1], 
    [Project1].[Id1] AS [Id1], 
    [Project1].[Data] AS [Data], 
    [Project1].[PersonId] AS [PersonId]
    FROM ( SELECT 
        [Extent1].[Id] AS [Id], 
        [Extent1].[Name] AS [Name], 
        [Extent2].[Id] AS [Id1], 
        [Extent2].[PersonId] AS [PersonId], 
        [Extent2].[Data] AS [Data], 
        CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1]
        FROM  [Person] AS [Extent1]
        LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
    )  AS [Project1]
    ORDER BY [Project1].[Id] ASC, [Project1].[C1] ASC
    

    So we get one row for each Person-Address pair (n rows for a Person with n Addresses), plus one row for each Person without any Addresses.
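    To make that row shape concrete, here is a minimal in-memory sketch of the same left outer join using LINQ to Objects. The record types, the `Flatten` helper and the sample data are hypothetical and only mirror the columns in the SQL above; this is not EF's own code.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types mirroring the columns of the generated SQL.
public record Person(int Id, string Name);
public record Address(int Id, int PersonId, string Data);
public record FlatRow(int PersonId, string Name, int? AddressId, string? Data);

public static class FlatRowDemo
{
    // In-memory equivalent of: Person LEFT OUTER JOIN Address ON Person.Id = Address.PersonId
    public static List<FlatRow> Flatten(IEnumerable<Person> people, IEnumerable<Address> addresses) =>
        people
            .GroupJoin(addresses, p => p.Id, a => a.PersonId, (p, matches) => (p, matches))
            .SelectMany(
                x => x.matches.DefaultIfEmpty(),   // a Person with no Address still yields one row
                (x, a) => new FlatRow(x.p.Id, x.p.Name, a?.Id, a?.Data))
            .ToList();

    public static List<FlatRow> Sample()
    {
        var people = new[] { new Person(1, "Alice"), new Person(2, "Bob"), new Person(3, "Carol") };
        var addresses = new[]
        {
            new Address(10, 1, "Home"),
            new Address(11, 1, "Work"),
            new Address(12, 3, "Home"),
        };
        return Flatten(people, addresses);
    }

    public static void Main()
    {
        // Alice contributes 2 rows, Bob 1 row with null address columns, Carol 1 row.
        foreach (var row in Sample())
            Console.WriteLine(row);
    }
}
```

    Three people produce four rows here, and Bob's row carries null address columns, which is the case the CASE WHEN ... IS NULL column in the SQL flags.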

    Adding an OrderBy clause, however, puts the thing-to-order-by at the start of the ordered columns:

    var people = c.People.Include("Addresses").OrderBy(p => Guid.NewGuid());
    

    gives

    SELECT 
    [Project1].[Id] AS [Id], 
    [Project1].[Name] AS [Name], 
    [Project1].[C2] AS [C1], 
    [Project1].[Id1] AS [Id1], 
    [Project1].[Data] AS [Data], 
    [Project1].[PersonId] AS [PersonId]
    FROM ( SELECT 
        NEWID() AS [C1], 
        [Extent1].[Id] AS [Id], 
        [Extent1].[Name] AS [Name], 
        [Extent2].[Id] AS [Id1], 
        [Extent2].[PersonId] AS [PersonId], 
        [Extent2].[Data] AS [Data], 
        CASE WHEN ([Extent2].[Id] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
        FROM  [Person] AS [Extent1]
        LEFT OUTER JOIN [Address] AS [Extent2] ON [Extent1].[Id] = [Extent2].[PersonId]
    )  AS [Project1]
    ORDER BY [Project1].[C1] ASC, [Project1].[Id] ASC, [Project1].[C2] ASC
    

    So in your case, the thing being ordered by is not a property of a Person but a volatile value, so it can differ between the Person-Address rows that belong to the same Person. Those rows are then no longer adjacent in the result, and the whole thing falls apart.
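    The effect can be reproduced without a database. The sketch below assumes a deliberately simplified model of the client-side reassembly: a new Person is started whenever the Person id changes from one row to the next. `Row`, `Regroup` and the fixed sort keys (standing in for NEWID() values) are hypothetical; EF's real materializer is more involved.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// One flattened Person-Address row plus the value it is sorted by.
public record Row(int PersonId, string Name, string Data, int SortKey);

public static class RegroupDemo
{
    // Simplified stand-in for the reassembly step: it relies on all rows
    // of one Person being adjacent in the incoming sequence.
    public static List<(int PersonId, List<string> Addresses)> Regroup(IEnumerable<Row> rows)
    {
        var result = new List<(int PersonId, List<string> Addresses)>();
        foreach (var row in rows)
        {
            if (result.Count == 0 || result[result.Count - 1].PersonId != row.PersonId)
                result.Add((row.PersonId, new List<string>()));
            result[result.Count - 1].Addresses.Add(row.Data);
        }
        return result;
    }

    // Sort key is the same for every row of a Person: groups stay together.
    public static List<Row> StableRows() => new()
    {
        new Row(1, "Alice", "Home", 1),
        new Row(1, "Alice", "Work", 1),
        new Row(2, "Bob", "Home", 2),
    };

    // Sort key differs per row, as a per-row NEWID() would: Alice's rows drift apart.
    public static List<Row> VolatileRows() => new()
    {
        new Row(1, "Alice", "Home", 3),
        new Row(1, "Alice", "Work", 1),
        new Row(2, "Bob", "Home", 2),
    };

    // Mirrors ORDER BY [C1], [Id] from the generated SQL.
    public static IEnumerable<Row> Sorted(IEnumerable<Row> rows) =>
        rows.OrderBy(r => r.SortKey).ThenBy(r => r.PersonId);

    public static void Main()
    {
        Console.WriteLine(Regroup(Sorted(StableRows())).Count);   // 2 people
        Console.WriteLine(Regroup(Sorted(VolatileRows())).Count); // 3 "people": Alice appears twice
    }
}
```

    With the volatile keys the sorted Person ids come out as 1, 2, 1, so Alice is emitted twice, each copy holding only part of her addresses.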


    I'm not sure where on the working-as-intended ~~~ cast-iron bug continuum this behaviour falls. But at least now we know about it.
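    One way to sidestep the problem, assuming the result set is small enough to load in full, is to let the database return the rows in its default stable order and shuffle only after the entities have been materialised. The sketch below uses plain in-memory classes; `ShuffleInMemory` is a hypothetical helper, and with EF the input list would come from something like c.People.Include("Addresses").ToList().

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shapes, standing in for the mapped classes.
public class Address
{
    public int Id;
    public string Data = "";
}

public class Person
{
    public int Id;
    public string Name = "";
    public List<Address> Addresses = new();
}

public static class ShuffleDemo
{
    // LINQ to Objects sorts whole Person objects, so each Person keeps
    // its own Addresses collection no matter where it ends up.
    public static List<Person> ShuffleInMemory(IEnumerable<Person> loaded) =>
        loaded.OrderBy(_ => Guid.NewGuid()).ToList();

    public static List<Person> Sample() => new()
    {
        new Person { Id = 1, Name = "Alice", Addresses = { new Address { Id = 10, Data = "Home" }, new Address { Id = 11, Data = "Work" } } },
        new Person { Id = 2, Name = "Bob" },
        new Person { Id = 3, Name = "Carol", Addresses = { new Address { Id = 12, Data = "Home" } } },
    };

    public static void Main()
    {
        foreach (var p in ShuffleInMemory(Sample()))
            Console.WriteLine($"{p.Name}: {p.Addresses.Count} address(es)");
    }
}
```

    The order of the output changes from run to run, but every Person appears exactly once with all of its addresses. The trade-off is that the whole set has to be loaded before it is shuffled.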
