FileStream.ReadAsync very slow compared to Read()

Submitted by 本小妞迷上赌 on 2021-02-07 14:27:23

Question


I have the following code that loops through a file and reads 1024 bytes at a time. The first pass uses FileStream.Read() and the second pass uses FileStream.ReadAsync().

private async void Button_Click(object sender, RoutedEventArgs e)
{
    await Task.Run(() => Test()).ConfigureAwait(false);
}

private async Task Test()
{
    Stopwatch sw = new Stopwatch();
    sw.Start();

    int readSize;
    int blockSize = 1024;
    byte[] data = new byte[blockSize];

    string theFile = @"C:\test.mp4";
    long totalRead = 0;

    using (FileStream fs = new FileStream(theFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite))
    {

        readSize = fs.Read(data, 0, blockSize);

        while (readSize > 0)
        {
            totalRead += readSize;
            readSize = fs.Read(data, 0, blockSize);
        }
    }

    sw.Stop();
    Console.WriteLine($"Read() Took {sw.ElapsedMilliseconds}ms and totalRead: {totalRead}");

    sw.Reset();
    sw.Start();
    totalRead = 0;
    using (FileStream fs = new FileStream(theFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, (blockSize*2), FileOptions.Asynchronous | FileOptions.SequentialScan))
    {
        readSize = await fs.ReadAsync(data, 0, blockSize).ConfigureAwait(false);

        while (readSize > 0)
        {
            totalRead += readSize;
            readSize = await fs.ReadAsync(data, 0, blockSize).ConfigureAwait(false);
        }
    }

    sw.Stop();
    Console.WriteLine($"ReadAsync() Took {sw.ElapsedMilliseconds}ms and totalRead: {totalRead}");
}

And the result:

Read() Took 162ms and totalRead: 162835040
ReadAsync() Took 15597ms and totalRead: 162835040

ReadAsync() is about 100 times slower. Am I missing anything? The only thing I can think of is the overhead of creating and destroying tasks when using ReadAsync(), but is that overhead really so large?

UPDATE:

I've changed the above code to reflect the suggestion by @Cory. There is a slight improvement:

Read() Took 142ms and totalRead: 162835040 
ReadAsync() Took 12288ms and totalRead: 162835040

When I increase the read block size to 1MB as suggested by @Alexandru, the results are much more acceptable:

Read() Took 32ms and totalRead: 162835040
ReadAsync() Took 76ms and totalRead: 162835040

So this hints that it is indeed the overhead of the large number of tasks that causes the slowness. But if creating and destroying a task takes a mere 100µs, things still don't really add up for the slowness with a small block size.
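As a rough sanity check on the figures above, the per-call cost can be worked out from the reported totals alone. The sketch below is only arithmetic on those numbers; the class and method names are made up for illustration. With 1 KB blocks the file takes 159,019 reads, so 15597ms works out to roughly 98µs per ReadAsync() call, against about 1µs per Read() call.

```csharp
using System;

static class ReadOverhead
{
    // Number of reads that return data for a file of totalBytes
    // read in blockSize chunks (ceiling division).
    public static long CallCount(long totalBytes, int blockSize) =>
        (totalBytes + blockSize - 1) / blockSize;

    // Average wall-clock cost of one call, in microseconds.
    public static double MicrosecondsPerCall(long elapsedMs, long calls) =>
        elapsedMs * 1000.0 / calls;

    public static void Print()
    {
        const long totalBytes = 162835040;         // totalRead reported above
        long calls = CallCount(totalBytes, 1024);  // 159,019 calls with 1 KB blocks

        Console.WriteLine($"calls: {calls}");
        Console.WriteLine($"Read():      {MicrosecondsPerCall(162, calls):F1} µs per call");
        Console.WriteLine($"ReadAsync(): {MicrosecondsPerCall(15597, calls):F1} µs per call");
    }
}
```

Calling ReadOverhead.Print() prints the three figures. On these numbers, a fixed cost of about 100µs per call would account for nearly the whole 15-second total at this block size.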


Answer 1:


Stick with big buffers if you're doing async, and make sure to turn on async mode in the FileStream constructor; then you should be okay. Each async method that you await like this hops off the current thread and back onto it when the operation completes. In your case the current thread is the UI thread, which can also be held up by any other async method doing the same hopping. That adds overhead when you make a large number of calls: imagine constructing a new thread and waiting for it to finish about 100K times, and in a UI app you additionally have to wait for the UI thread to be free before the continuation can resume on it. To reduce the number of calls, simply read in larger increments by increasing the buffer size. Make sure to test this code in Release mode, so that all compiler optimizations are enabled and the debugger does not slow things down:

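using System;
using System.Diagnostics;
using System.IO;
using System.Threading.Tasks;

class Program
{
    static void Main(string[] args)
    {
        // Block until both measurements finish; async void would hide exceptions.
        DoStuff().GetAwaiter().GetResult();
        Console.ReadLine();
    }

    public static async Task DoStuff()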
    {
        var filename = @"C:\Example.txt";

        var sw = new Stopwatch();
        sw.Start();
        ReadAllFile(filename);
        sw.Stop();

        Console.WriteLine("Sync: " + sw.Elapsed);

        sw.Restart();
        await ReadAllFileAsync(filename);
        sw.Stop();

        Console.WriteLine("Async: " + sw.Elapsed);
    }

    static void ReadAllFile(string filename)
    {
        byte[] buffer = new byte[131072];
        using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, buffer.Length, false))
            while (true)
                if (file.Read(buffer, 0, buffer.Length) <= 0)
                    break;
    }

    static async Task ReadAllFileAsync(string filename)
    {
        byte[] buffer = new byte[131072];
        using (var file = new FileStream(filename, FileMode.Open, FileAccess.Read, FileShare.Read, buffer.Length, true))
            while (true)
                if ((await file.ReadAsync(buffer, 0, buffer.Length)) <= 0)
                    break;
    }
}

Results:

Sync: 00:00:00.3092809

Async: 00:00:00.5541262

Pretty negligible...the file is about 1 GB.

Let's say I go even bigger, a 1 MB buffer, AKA new byte[1048576] (come on man, everyone has 1 MB of RAM these days):

Sync: 00:00:00.2925763

Async: 00:00:00.3402034

Then it's just a few hundredths of a second difference. If you blink, you'll miss it.




Answer 2:


Your method signature suggests you're doing this from a WPF app. While the blocking code will take up the UI thread during this time, the async code will be forced to go through the UI message queue every time an asynchronous operation completes, slowing it down and competing with any UI messages. You should try removing it from the UI thread like so:

void Button_Click(object sender, RoutedEventArgs e)
{
    Task.Run(() => Button_Click_Impl());
}

async Task Button_Click_Impl()
{
    // put code here.
}

Next, open the file in async mode. If you don't do this, async is emulated and will go much slower:

new FileStream(theFile, FileMode.Open, FileAccess.Read, FileShare.ReadWrite, 4096,
               FileOptions.Asynchronous | FileOptions.SequentialScan)

Finally, you may also be able to gain a little performance by using ConfigureAwait(false) to avoid moving between threads:

readSize = await fs.ReadAsync(data, 0, 1024).ConfigureAwait(false);
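Put together, the three suggestions above might look like the following sketch. The class name, method name, and the blockSize parameter are illustrative and not from the original code; the FileStream arguments are the ones shown above.

```csharp
using System.IO;
using System.Threading.Tasks;

static class AsyncFileReader
{
    // Reads the whole file in blockSize chunks and returns the number of bytes read.
    public static async Task<long> CountBytesAsync(string path, int blockSize)
    {
        byte[] data = new byte[blockSize];
        long totalRead = 0;

        // Open in async mode so ReadAsync is not emulated on top of blocking reads.
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.ReadWrite,
                                       4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
        {
            int readSize;
            // ConfigureAwait(false): do not resume on the captured context.
            while ((readSize = await fs.ReadAsync(data, 0, data.Length).ConfigureAwait(false)) > 0)
            {
                totalRead += readSize;
            }
        }

        return totalRead;
    }
}
```

From the button handler this could be started with Task.Run(() => AsyncFileReader.CountBytesAsync(theFile, 1024 * 1024)), so that none of the continuations are posted to the UI message queue.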



Answer 3:


The overhead of a single ReadAsync operation is much higher than that of a single Read operation (especially if you do not open the file in the right mode; see the other answers). If you end up with the whole file in memory anyway, just query the file's size, allocate a large enough buffer, and read everything at once. Otherwise, you can still increase the buffer size to e.g. 32 MiB, or even more if you expect larger files. That should speed everything up considerably.
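A minimal sketch of the "read everything at once" approach follows, assuming the file fits in a single byte array. The class and method names are hypothetical. The loop is there because a single ReadAsync call may return fewer bytes than requested.

```csharp
using System.IO;
using System.Threading.Tasks;

static class WholeFileReader
{
    // Reads the entire file into one buffer sized from the file's length.
    // Assumption: the file is small enough for a single array.
    public static async Task<byte[]> ReadWholeFileAsync(string path)
    {
        using (var fs = new FileStream(path, FileMode.Open, FileAccess.Read, FileShare.Read,
                                       4096, FileOptions.Asynchronous | FileOptions.SequentialScan))
        {
            byte[] buffer = new byte[fs.Length];
            int offset = 0;

            // Keep reading until the buffer is full; one call may return a partial read.
            while (offset < buffer.Length)
            {
                int n = await fs.ReadAsync(buffer, offset, buffer.Length - offset).ConfigureAwait(false);
                if (n == 0)
                    throw new EndOfStreamException("File ended before the expected length was read.");
                offset += n;
            }

            return buffer;
        }
    }
}
```

For a file that is not changing on disk, this typically needs only a handful of ReadAsync calls, so the per-call overhead becomes negligible.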

Only bother with launching a new task if there is considerable CPU-bound work for each block. Otherwise the UI should stay responsive while the ReadAsync operations (with a sufficiently large buffer) take their time. If an operation completes immediately, you may still be blocking the UI; see the remarks at Task.Yield().



Source: https://stackoverflow.com/questions/39353119/filestream-readasync-very-slow-compared-to-read
