Very slow foreach loop

Submitted by ☆樱花仙子☆ on 2019-12-23 08:51:43

Question


I am working on an existing application. This application reads data from a huge file and then, after doing some calculations, it stores the data in another table.

But the loop doing this (see below) is taking a really long time. Since the file sometimes contains thousands of records, the entire process takes days.

Can I replace this foreach loop with something else? I tried using Parallel.ForEach and it did help. I am new to this, so I would appreciate your help.

foreach (record someRecord in someReport.r)
{
    try
    {
        using (var command = new SqlCommand("[procname]", sqlConn))
        {
            command.CommandTimeout = 0;
            command.CommandType = CommandType.StoredProcedure;
            command.Parameters.Add(…);

            IAsyncResult result = command.BeginExecuteReader();
            while (!result.IsCompleted)
            {
                System.Threading.Thread.Sleep(10);
            }
            command.EndExecuteReader(result);
        }
    }
    catch (Exception e)
    {
        …
    }
}

After reviewing the answers, I removed the async calls and edited the code as shown below. But this did not improve performance.

using (command = new SqlCommand("[sp]", sqlConn))
{
    command.CommandTimeout = 0;
    command.CommandType = CommandType.StoredProcedure;
    foreach (record someRecord in someReport.)
    {
        command.Parameters.Clear();
        command.Parameters.Add(....);
        command.Prepare();

        using (dr = command.ExecuteReader())
        {
            while (dr.Read())
            {
                if ()
                {

                }
                else if ()
                {

                }
            }
        }                             
    }                        
}

Answer 1:


Instead of hitting the SQL connection so many times, have you considered extracting the whole set of data from SQL Server and processing it in memory via a dataset?

Edit: I decided to explain further what I meant. You can do the following (pseudocode):

  1. Use a select * to get all the information from the database and store it in a list of a class or in a dictionary.
  2. Run your foreach (record someRecord in someReport) and do the condition matching as usual.
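
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the asker's actual code: the class name, the int key, and the string row type are all assumptions, and the dictionary is filled by hand where a single up-front query would normally populate it.

```csharp
using System;
using System.Collections.Generic;

public static class InMemoryLookupSketch
{
    // Counts how many record keys have a matching row in the preloaded data.
    // "loaded" stands in for the result of the single up-front query (hypothetical shape).
    public static int CountMatches(IEnumerable<int> recordKeys,
                                   IDictionary<int, string> loaded)
    {
        int matches = 0;
        foreach (int key in recordKeys)
        {
            string row;
            // Dictionary lookup replaces one database round trip per record.
            if (loaded.TryGetValue(key, out row))
            {
                matches++; // the real condition matching would go here
            }
        }
        return matches;
    }

    public static void Main()
    {
        var loaded = new Dictionary<int, string> { { 1, "alpha" }, { 3, "gamma" } };
        Console.WriteLine(CountMatches(new[] { 1, 2, 3 }, loaded)); // prints 2
    }
}
```

Whether this helps depends on the table fitting comfortably in memory, which the question does not say.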



Answer 2:


Step 1: Ditch the try at async. It isn't implemented properly and you're blocking anyway. So just execute the procedure and see if that helps.

Step 2: Move the SqlCommand outside of the loop and reuse it for each iteration. That way you don't incur the cost of creating and destroying it for every item in your loop.

Warning: Make sure you reset/clear/remove parameters you don't need from the previous iteration. We did something like this with optional parameters and had 'bleed-thru' from the previous iteration because we didn't clean up parameters we didn't need!
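
To make the warning concrete, here is a small sketch of rebinding parameters on a reused command. The parameter names @Id and @Note and the helper Bind are hypothetical, and nothing is executed against a database; the example only shows that clearing the collection first keeps an optional parameter from an earlier iteration from lingering.

```csharp
using System;
using System.Data;
using System.Data.SqlClient;

public static class ParameterReuseSketch
{
    // Rebinds the parameters for one iteration. @Id and @Note are hypothetical names.
    public static void Bind(SqlCommand command, int id, string note)
    {
        // Without this Clear(), a @Note added in an earlier iteration would stay attached.
        command.Parameters.Clear();
        command.Parameters.Add("@Id", SqlDbType.Int).Value = id;
        if (note != null)
        {
            // Optional parameter: only added when this record has a note.
            command.Parameters.Add("@Note", SqlDbType.NVarChar, 200).Value = note;
        }
    }

    public static void Main()
    {
        // No connection is attached because the command is never executed here.
        using (var command = new SqlCommand("[procname]"))
        {
            command.CommandType = CommandType.StoredProcedure;
            Bind(command, 1, "first"); // this iteration supplies the optional @Note
            Bind(command, 2, null);    // this one does not
            Console.WriteLine(command.Parameters.Contains("@Note")); // prints False
        }
    }
}
```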




Answer 3:


Your biggest problem is that you're looping over this:

IAsyncResult result = command.BeginExecuteReader();

while (!result.IsCompleted)
{
   System.Threading.Thread.Sleep(10);
}

command.EndExecuteReader(result);

The entire idea of the asynchronous model is that the calling thread (the one doing this loop) should be spinning up ALL of the asynchronous tasks using the Begin method before starting to work with the results with the End method. If you are using Thread.Sleep() within your main calling thread to wait for an asynchronous operation to complete (as you are here), you're doing it wrong, and what ends up happening is that each command, one at a time, is being called and then waited for before the next one starts.

Instead, try something like this:

public void BeginExecutingCommands(Report someReport)
{
    foreach (record someRecord in someReport.r) 
    {
        var command = new SqlCommand("[procname]", sqlConn);

        command.CommandTimeout = 0;
        command.CommandType = CommandType.StoredProcedure;
        command.Parameters.Add(…);

        command.BeginExecuteReader(ReaderExecuted, 
            new object[] { command, someReport, someRecord });                   
    }
}

void ReaderExecuted(IAsyncResult result)
{
    var state = (object[])result.AsyncState;
    var command = state[0] as SqlCommand;
    var someReport = state[1] as Report;
    var someRecord = state[2] as Record;

    try
    {
        using (SqlDataReader reader = command.EndExecuteReader(result))
        {
            // work with reader, command, someReport and someRecord to do what you need.
        }
    }
    catch (Exception ex)
    {
        // handle exceptions that occurred during the async operation here
    }
}
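
The same "start everything, then collect the results" idea can also be sketched with tasks instead of the Begin/End callback pattern used above. This is a swapped-in technique, not what the answer itself shows, and ProcessAsync is a hypothetical stand-in that simulates latency with Task.Delay rather than calling a database.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;

public static class StartAllThenWaitSketch
{
    // Hypothetical stand-in for one asynchronous database call.
    static async Task<int> ProcessAsync(int recordId)
    {
        await Task.Delay(100); // simulates I/O latency
        return recordId * 2;
    }

    public static async Task<int[]> ProcessAllAsync(IEnumerable<int> recordIds)
    {
        // Start every operation first; ToList() forces all the calls to begin now.
        List<Task<int>> pending = recordIds.Select(id => ProcessAsync(id)).ToList();

        // Only then wait, so the operations overlap instead of running one at a time.
        return await Task.WhenAll(pending);
    }

    public static void Main()
    {
        int[] results = ProcessAllAsync(new[] { 1, 2, 3 }).GetAwaiter().GetResult();
        Console.WriteLine(string.Join(", ", results)); // prints 2, 4, 6
    }
}
```

With a real database, overlapping commands would typically each need their own SqlConnection, since a single connection does not run several readers at once unless MARS is enabled.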



Answer 4:


In SQL Server, on the other end of a write is a single disk. You can rarely write faster in parallel. In fact, writing in parallel often slows things down due to index fragmentation. If you can, sort the data by the primary (clustered) key prior to loading. For a big load, even disable the other keys, load the data, and then rebuild the keys.
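
As a rough illustration of the sorting advice, the records could be ordered in memory before the load loop runs. The Row type and its Id property are assumptions standing in for whatever column the clustered key is actually defined on.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical record shape; Id stands in for the clustered key column.
public sealed class Row
{
    public int Id { get; private set; }
    public string Payload { get; private set; }

    public Row(int id, string payload)
    {
        Id = id;
        Payload = payload;
    }
}

public static class SortBeforeLoadSketch
{
    // Returns the rows in clustered-key order so inserts arrive in index order.
    public static List<Row> InKeyOrder(IEnumerable<Row> rows)
    {
        return rows.OrderBy(r => r.Id).ToList();
    }

    public static void Main()
    {
        var unsorted = new[] { new Row(3, "c"), new Row(1, "a"), new Row(2, "b") };
        foreach (Row row in InKeyOrder(unsorted))
        {
            // The per-record stored procedure call would go here.
            Console.WriteLine(row.Id + " " + row.Payload);
        }
    }
}
```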

I'm not really sure what you are doing in the async code, but it was certainly not doing what you expected, as it was waiting on itself.

try
{
    using (var command = new SqlCommand("[procname]", sqlConn))
    {
        command.CommandTimeout = 0;
        command.CommandType = CommandType.StoredProcedure;

        foreach (record someRecord in someReport.r)
        {
            command.Parameters.Clear();
            command.Parameters.Add(…);

            using (var rdr = command.ExecuteReader())
            {
                while (rdr.Read())
                {
                    …
                }
            }
        }
    }
}
catch (…)
{
    …
}



Answer 5:


As we were talking about in the comments, storing this data in memory and working with it there may be a more efficient approach.

So one easy way to do that is to start with Entity Framework. Entity Framework will automatically generate the classes for you based on your database schema. Then you can import a stored procedure which holds your SELECT statement. The reason I suggest importing a stored proc into EF is that this approach is generally more efficient than doing your queries in LINQ against EF.

Then run the stored proc and store the data in a List like this...

var data = db.MyStoredProc().ToList();

Then you can do anything you want with that data. Or as I mentioned, if you're doing a lot of lookups on primary keys then use ToDictionary() something like this...

var data = db.MyStoredProc().ToDictionary(k => k.MyPrimaryKey);

Either way, you'll be working with your data in memory at this point.




Answer 6:


It seems that executing your SQL command puts a lock on some required resources, and that is what forced you to use the async methods (my guess).

If the database is not in use, try getting exclusive access to it. If there are still internal transactions caused by the complexity of the data model, consider consulting the database designer.



Source: https://stackoverflow.com/questions/12202137/very-slow-foreach-loop
