Severe performance problems when performing a simple query using Entity Framework

若如初见. 提交于 2020-01-14 03:13:08

问题


I've got a fairly generic CRUD webapp, which generates pages dynamically according to the contents of several database tables. I'm using Entity Framework 4.0 to pull this data out of the DB, however I'm running into severe performance problems. I've managed to iterate down into a problem which is contained enough that I can detail below.

I have a table containing list of Page Forms (~200). Each form has one or more Fields (~4000 total), and each field has may have some Parameters (~16000 total).

I've attached a screenshot of my model below:

The associated entity objects are as follows:

public class Form
{
    public int FormID { get; set; }
    public string FormName { get; set; }

    public IList<FormField> FormFields { get; set; }

}

public class FormField
{
    public int FieldID { get; set; }
    public string FieldName { get; set; }
    public int FormID{ get; set; } 

    public IList<FormFieldParameter> FormFieldParameters { get; set; }
    public Form ParentForm { get; set; }

}

public class FormFieldParameter
{
    public int FieldParamID{ get; set; }
    public string Value{ get; set; }
    public int? FieldID { get; set; }

    public FormField ParentField { get; set; }
}

The following code pulls out all data for the Form which has an ID of '1'.

EntityConnection myConnection = new EntityConnection("name=myModel");

if(conn.State != ConnectionState.Open) {
    conn.Open();
}
ObjectContext context = new ObjectContext("name=myModel");
context.ContextOptions.LazyLoadingEnabled = false;

ObjectQuery<PageForm> myObjectSet = context.CreateObjectSet<PageForm>()
                                           .Include("FormField.FormFieldParameter");

//Edit: I missed this part out, sorry. In hindsight, this was exactly what was
//causing the issue.
IEnumerable<PageForm> myObjectSetEnumerable = myObjectSet.AsEnumerable();
IQueryable<PageForm> myFilteredObjectSet = myObjectSetEnumerable.Where(c => c.FormID == 1)
                                                                .AsQueryable();


List<PageForm> myReturnValue = myFilteredObjectSet.toList();

Now, while this does work, it runs really poorly. The query takes over a second to run, the entirety of which is spent in the myFilteredObjectSet.toList() call. I ran a profiler on my database to see what was causing the delay, and found that the following query was being generated:

SELECT 
[Project1].[FormID] AS [FormID], 
[Project1].[FormName] AS [FormName], 
[Project1].[C2] AS [C1], 
[Project1].[FormID1] AS [FormID1], 
[Project1].[FieldID] AS [FieldID], 
[Project1].[FieldName] AS [FieldName], 
[Project1].[C1] AS [C2], 
[Project1].[FieldParamID] AS [FieldParamID], 
[Project1].[Value] AS [Value], 
[Project1].[FieldID1] AS [FieldID1]
FROM ( SELECT 
    [Extent1].[FormID] AS [FormID], 
    [Extent1].[FormName] AS [FormName], 
    [Join1].[FieldID] AS [FieldID], 
    [Join1].[FieldName] AS [FieldName], 
    [Join1].[FormID] AS [FormID1], 
    [Join1].[FieldParamID] AS [FieldParamID], 
    [Join1].[Value] AS [Value], 
    [Join1].[FieldID1] AS [FieldID1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) WHEN ([Join1].[FieldParamID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C1], 
    CASE WHEN ([Join1].[FieldID] IS NULL) THEN CAST(NULL AS int) ELSE 1 END AS [C2]
    FROM  [dbo].[PageForm] AS [Extent1]
    LEFT OUTER JOIN  (SELECT [Extent2].[FieldID] AS [FieldID], [Extent2].[FieldName] AS [FieldName], [Extent2].[FormID] AS [FormID], [Extent3].[FieldParamID] AS [FieldParamID], [Extent3].[Value] AS [Value], [Extent3].[FieldID] AS [FieldID1]
        FROM  [dbo].[FormField] AS [Extent2]
        LEFT OUTER JOIN [dbo].[FormFieldParameter] AS [Extent3] ON [Extent2].[FieldID] = [Extent3].[FieldID] ) AS [Join1] ON [Extent1].[FormID] = [Join1].[FormID]
)  AS [Project1]
ORDER BY [Project1].[FormID] ASC, [Project1].[C2] ASC, [Project1].[FieldID] ASC, [Project1].[C1] ASC

The duration of this query shown on the sql profiler shows that this query is what is taking so long to run. The interesting thing about the query, is that there is no filtering on it at all - It is returning the entire tree! I can't understand why it is returning everything, as the filter myObjectSet.Where(c => c.FormID == 1) is pretty explicit. The actual returned object only contains a single entry, which I would expect.

I'm having this problem across my entire data access layer, and its performance is appalling. I have no idea why the generated query doesn't contain the filter - and no idea how to tell it to do so. Does anybody know the answer?


回答1:


TL;DR Remove the AsEnumerable call and replace it with an AsQueryable call and it should resolve most of the performance issues (outside of actual database execution cost being slow, which is fixed by adding indexes on columns you are filtering / joining on).

Explanation of what is actually happening...

As soon as you call AsEnumerable you are now outside of Entity Framework and in the world of LINQ-to-objects. That means it's going to execute the query against the database when it is enumerated against. It doesn't matter that you call AsQueryable again, that merely means that you are creating a query against an in memory structure.

The effective execution is this.

  1. Create an object query, including all FormFieldProperties linked to the form
  2. Transform the current IQueryable instance into an enumerable.
  3. Add a predicate against the enumerable instance which will only return items whose FormID value is one.
  4. Call ToList, which copies all values from source enumerable to a list.

Now, up until step 4, the query actually hasn't queried the database. When you call ToList, it executes the query in step one (as you see). This query likely is expensive and takes a while because of the amount of data it is returning and/or missing indexes that may improve it's performance.

Once that query is done and materialized, it's result is wrapped in an enumerator.

Now, every object is iterated and checked to see if it matches the predicate that was added in step 3. If it does match, then it is returned to whoever is iterating over it (in this case, the ToList function).

Now that the value has been returned, it is added to the list that is being created with the values.

Finally, you get a list back from the ToList method, and it has exactly what you asked for, but it did all of that in memory rather than in the database.



来源:https://stackoverflow.com/questions/11727534/severe-performance-problems-when-performing-a-simple-query-using-entity-framewor

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!