Question
I have 3 files, each 1 million rows long and I'm reading them line by line. No processing, just reading as I'm just trialling things out.
If I do this synchronously it takes 1 second. If I switch to using Threads, one for each file, it is slightly quicker (code not below, but I simply created a new Thread and started it for each file).
When I change to async, it takes 40 seconds, which is 40 times as long. Once I add any actual processing, I cannot see why I would ever choose async over synchronous code, or over Threads if I wanted a responsive application.
Or am I doing something fundamentally wrong with this code, and not using async as it was intended?
Thanks.
using System;
using System.Diagnostics;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

class AsyncTestIOBound
{
    Stopwatch sw = new Stopwatch();

    internal void Tests()
    {
        DoSynchronous();
        DoASynchronous();
    }

    #region sync
    private void DoSynchronous()
    {
        sw.Restart();
        var start = sw.ElapsedMilliseconds;
        Console.WriteLine($"Starting Sync Test");
        DoSync("Addresses", "SampleLargeFile1.txt");
        DoSync("routes ", "SampleLargeFile2.txt");
        DoSync("Equipment", "SampleLargeFile3.txt");
        sw.Stop();
        Console.WriteLine($"Ended Sync Test. Took {(sw.ElapsedMilliseconds - start)} mseconds");
        Console.ReadKey();
    }

    // Reads the file line by line on the calling thread and returns the line count.
    private long DoSync(string v, string filename)
    {
        string line;
        long counter = 0;
        using (StreamReader file = new StreamReader(filename))
        {
            while ((line = file.ReadLine()) != null)
            {
                counter++;
            }
        }
        Console.WriteLine($"{v}: T{Thread.CurrentThread.ManagedThreadId}: Lines: {counter}");
        return counter;
    }
    #endregion

    #region async
    private void DoASynchronous()
    {
        sw.Restart();
        var start = sw.ElapsedMilliseconds;
        Console.WriteLine($"Starting Async Test");
        Task a = DoASync("Addresses", "SampleLargeFile1.txt");
        Task b = DoASync("routes ", "SampleLargeFile2.txt");
        Task c = DoASync("Equipment", "SampleLargeFile3.txt");
        // Blocks the calling thread until all three reads have finished.
        Task.WaitAll(a, b, c);
        sw.Stop();
        Console.WriteLine($"Ended Async Test. Took {(sw.ElapsedMilliseconds - start)} mseconds");
        Console.ReadKey();
    }

    // Awaits once per line: one million awaits per file.
    private async Task<long> DoASync(string v, string filename)
    {
        string line;
        long counter = 0;
        using (StreamReader file = new StreamReader(filename))
        {
            while ((line = await file.ReadLineAsync()) != null)
            {
                counter++;
            }
        }
        Console.WriteLine($"{v}: T{Thread.CurrentThread.ManagedThreadId}: Lines: {counter}");
        return counter;
    }
    #endregion
}
Answer 1:
Since you are using await many times in a giant loop (in your case, once for each line of a "SampleLargeFile"), you are doing a lot of context switching, and the overhead can be really bad.
For each line, your code may be switching between the files. If your computer uses a hard drive, this can get even worse. Imagine the head of your HD going crazy.
When you use normal threads, you are not switching context for each line.
To solve this, read the file in a single pass. You can still use async/await (ReadToEndAsync()) and get good performance.
EDIT
So, you are trying to count the lines in the text file using async, right?
Try this (no need to load the entire file in memory):
private async Task<int> CountLines(string path)
{
    int count = 0;
    // Run the synchronous read loop on a thread-pool thread and await it once.
    await Task.Run(() =>
    {
        using (FileStream fs = File.Open(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
        using (BufferedStream bs = new BufferedStream(fs))
        using (StreamReader sr = new StreamReader(bs))
        {
            while (sr.ReadLine() != null)
            {
                count++;
            }
        }
    });
    return count;
}
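As a usage sketch, the three files from the question could be counted concurrently by starting one such task per file and awaiting them together with Task.WhenAll. The names ConcurrentLineCount, CountLinesAsync and CountAllAsync are hypothetical, and File.ReadLines is swapped in for the explicit StreamReader loop only to keep the example short.

```csharp
using System.IO;
using System.Linq;
using System.Threading.Tasks;

static class ConcurrentLineCount
{
    // Hypothetical helper: runs a synchronous line count on a thread-pool thread.
    public static Task<int> CountLinesAsync(string path)
    {
        return Task.Run(() => File.ReadLines(path).Count());
    }

    // Starts one task per file, then awaits all of them with a single await.
    public static async Task<int[]> CountAllAsync(params string[] paths)
    {
        Task<int>[] tasks = paths.Select(p => CountLinesAsync(p)).ToArray();
        return await Task.WhenAll(tasks);
    }
}
```

From an async method this might be called as `int[] counts = await ConcurrentLineCount.CountAllAsync("SampleLargeFile1.txt", "SampleLargeFile2.txt", "SampleLargeFile3.txt");`. The results come back in the same order as the paths passed in.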
Answer 2:
A few things. First, I would read all lines at once in the async method so that you only await once (instead of once per line).
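private async Task<long> DoASync(string v, string filename)
{
    string lines;
    long counter = 0;
    using (StreamReader file = new StreamReader(filename))
    {
        // A single await for the whole file instead of one per line.
        lines = await file.ReadToEndAsync();
    }
    // Note: Split('\n') yields one extra, empty entry if the file ends with a newline.
    counter = lines.Split('\n').Length;
    Console.WriteLine($"{v}: T{Thread.CurrentThread.ManagedThreadId}: Lines: {counter}");
    return counter;
}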
Next, you can also await each Task individually. This lets the CPU focus on one file at a time instead of possibly switching between the three, which would add more overhead.
// Returns Task rather than void so that the caller can await completion and observe exceptions.
private async Task DoASynchronous()
{
    sw.Restart();
    var start = sw.ElapsedMilliseconds;
    Console.WriteLine($"Starting Async Test");
    await DoASync("Addresses", "SampleLargeFile1.txt");
    await DoASync("routes ", "SampleLargeFile2.txt");
    await DoASync("Equipment", "SampleLargeFile3.txt");
    sw.Stop();
    Console.WriteLine($"Ended Async Test. Took {(sw.ElapsedMilliseconds - start)} mseconds");
    Console.ReadKey();
}
The reason you are seeing slower performance is the overhead that await adds. Each await costs extra CPU, and here you pay it once per line: the async machinery adds processing, allocations and synchronization. Also, we need to transition to kernel mode twice instead of once (first to initiate the IO, then to dequeue the IO completion notification).
For more information, see: Does async await increases Context switching
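One way to keep the file I/O asynchronous while paying that per-await cost far less often is to await once per large chunk rather than once per line. The following is a minimal sketch under stated assumptions, not a tuned implementation: the class and method names and the 64 KB chunk size are hypothetical choices, and it counts '\n' bytes, so it assumes an ASCII or UTF-8 file and does not count a final line that lacks a trailing newline.

```csharp
using System.IO;
using System.Threading.Tasks;

static class ChunkedLineCounter
{
    // Counts '\n' bytes, awaiting once per chunk rather than once per line.
    public static async Task<long> CountNewlinesAsync(string path)
    {
        const int ChunkSize = 64 * 1024; // assumed chunk size, tune as needed
        byte[] buffer = new byte[ChunkSize];
        long count = 0;

        // useAsync: true requests asynchronous (overlapped) I/O from the OS where supported.
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read,
                                       FileShare.Read, ChunkSize, useAsync: true))
        {
            int read;
            while ((read = await fs.ReadAsync(buffer, 0, buffer.Length)) > 0)
            {
                for (int i = 0; i < read; i++)
                {
                    if (buffer[i] == (byte)'\n')
                    {
                        count++;
                    }
                }
            }
        }
        return count;
    }
}
```

With a chunk of this size, the number of awaits depends on the file size in bytes rather than on the number of lines. Whether this beats the synchronous loop on a given machine would still need to be measured.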
Source: https://stackoverflow.com/questions/54753339/async-file-reading-40-times-slower-than-synchronous-or-manual-threads