non-blocking IO vs async IO and implementation in Java

挽巷 2020-11-30 17:26

Trying to summarize for myself the difference between these 2 concepts (because I'm really confused when I see people are using both of them in one sentence, like "non-blo

5 Answers
  •  抹茶落季
    2020-11-30 17:32

    Synchronous vs. asynchronous

    Asynchronous is a relative term that applies to all kinds of computation, not just IO. Nothing is asynchronous by itself; it is always asynchronous relative to something else. Usually, asynchronicity means that an operation runs in a different thread of execution from the thread that requested it, and there is no explicit synchronization (waiting) between the requesting thread and the computing thread. If the requesting thread waits (sleeps, blocks) while the computing thread does its work, we call the operation synchronous. There are also mixed cases: sometimes the requesting thread doesn't wait immediately and performs a fixed amount of useful work asynchronously after issuing an IO request, but later blocks (synchronizes) to wait for the IO results if they are not yet available.
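
    The mixed case can be sketched in Java with a CompletableFuture. This is only an illustration: slowRead is a hypothetical stand-in that sleeps instead of doing real IO, and the class and method names are made up for the example.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

public class MixedSyncAsync {

    // Hypothetical stand-in for a slow IO call; it only sleeps.
    static String slowRead() {
        try {
            TimeUnit.MILLISECONDS.sleep(200);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return "payload";
    }

    static String requestThenWorkThenWait() {
        // Asynchronous part: the read runs on another thread.
        CompletableFuture<String> pending =
                CompletableFuture.supplyAsync(MixedSyncAsync::slowRead);

        // The requesting thread does a fixed amount of its own work meanwhile.
        int checksum = 0;
        for (int i = 0; i < 1_000; i++) {
            checksum += i;
        }

        // Synchronous part: block here until the result is available.
        String data = pending.join();
        return data + ":" + checksum;
    }

    public static void main(String[] args) {
        System.out.println(requestThenWorkThenWait());
    }
}
```

    The loop overlaps with the simulated read, and join() is the point where the requesting thread finally synchronizes with the computing thread.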

    Blocking vs. non-blocking

    In the broader sense, "blocking" and "non-blocking" can roughly be used to denote "synchronous" and "asynchronous" respectively. You will often see "blocking" used interchangeably with "synchronous" and "non-blocking" with "asynchronous". In this sense, "non-blocking asynchronous" is redundant, as other folks mentioned above.

    However, in a narrower sense, "blocking" and "non-blocking" may refer to different kernel IO interfaces. It's worth saying here that all IO operations these days are performed by the OS kernel, because access to IO hardware devices such as disks or network interface cards is abstracted away by the OS. This means that every IO operation you request from your userspace code ends up being executed by the kernel via either a blocking or a non-blocking interface.

    When called via the blocking interface, the kernel assumes that your thread wants to obtain the results synchronously and puts it to sleep (deschedules, blocks it) until the IO results are available. That thread therefore cannot do any other useful work while the kernel is fulfilling the IO request. As an example, ordinary read and write calls on regular files on Linux are blocking: the O_NONBLOCK flag has no effect on them.

    Non-blocking kernel interfaces work differently. You tell the kernel which IO operations you are interested in. The kernel doesn't block (deschedule) your thread and returns from the IO call immediately. Your thread can then move on and do other useful work while the kernel fulfills the IO requests asynchronously. Your code then needs to check occasionally whether the kernel has done its job, after which you can consume the results. As an example, Linux provides the epoll interface for non-blocking IO, which reports which descriptors are ready so that you don't have to probe each one in turn. There are also the older poll and select system calls for the same purpose. It's worth noting that non-blocking interfaces are mostly applicable to, and mostly used for, networking.
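
    In Java, this style of IO is exposed through java.nio channels and Selector (which on Linux is typically backed by epoll). The following is a minimal sketch that uses an in-process Pipe instead of a real network socket so that it runs on its own; the class name and the "ping" payload are made up for the example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.Pipe;
import java.nio.channels.SelectionKey;
import java.nio.channels.Selector;
import java.nio.charset.StandardCharsets;

public class NonBlockingPipe {

    static String demo() {
        try (Selector selector = Selector.open()) {
            Pipe pipe = Pipe.open();
            try (Pipe.SourceChannel source = pipe.source();
                 Pipe.SinkChannel sink = pipe.sink()) {

                // Switch the reading end to non-blocking mode and ask the
                // selector to report when it becomes readable.
                source.configureBlocking(false);
                source.register(selector, SelectionKey.OP_READ);

                ByteBuffer buffer = ByteBuffer.allocate(64);

                // Nothing has been written yet: the call returns at once
                // with 0 bytes instead of putting the thread to sleep.
                int first = source.read(buffer);

                sink.write(ByteBuffer.wrap("ping".getBytes(StandardCharsets.UTF_8)));

                // Wait until the kernel reports the channel as readable.
                selector.select();
                int second = source.read(buffer);

                buffer.flip();
                String text = StandardCharsets.UTF_8.decode(buffer).toString();
                return first + "," + second + "," + text;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    On a typical run this prints 0,4,ping: the first read returns immediately with nothing, and the second one succeeds after the selector signals readiness. A real server would call select() in a loop and serve many sockets from one thread.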

    Please note that the fact that some higher-level IO API uses blocking kernel IO under the hood doesn't mean that your thread will necessarily block when calling that API. Such an API may spawn a new thread, or use a different existing one, to perform the blocking IO. It will notify your calling thread later through some means (a callback, an event, or by letting your thread poll) that it has completed the IO request. In other words, non-blocking IO semantics can be implemented in userspace by third-party libraries or runtimes on top of the blocking OS kernel interfaces by using additional threads.
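
    A minimal Java sketch of that idea: a blocking file read is handed to a small thread pool, and the caller gets a CompletableFuture back right away. readAsync, the pool size of 2, and the temp file are assumptions made for the example, and Files.readString / Files.writeString require Java 11 or newer.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

public class AsyncOverBlocking {

    // Hypothetical helper: wraps a blocking read so the caller does not wait.
    static CompletableFuture<String> readAsync(Path path, ExecutorService ioPool) {
        return CompletableFuture.supplyAsync(() -> {
            try {
                return Files.readString(path); // blocking call, runs on a pool thread
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        }, ioPool);
    }

    static String demo() {
        ExecutorService ioPool = Executors.newFixedThreadPool(2);
        try {
            Path file = Files.createTempFile("async-demo", ".txt");
            Files.writeString(file, "hello");

            // Returns immediately; thenApply registers a callback that runs
            // once the pool thread has finished the blocking read.
            CompletableFuture<Integer> length =
                    readAsync(file, ioPool).thenApply(String::length);

            // The caller could do other work here. The demo joins only to
            // collect the result before cleaning up.
            int chars = length.join();
            Files.delete(file);
            return "read " + chars + " chars";
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } finally {
            ioPool.shutdown();
        }
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

    From the caller's point of view readAsync is non-blocking, even though every read underneath is an ordinary blocking call on a pool thread.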

    Conclusion

    To understand how each particular runtime or library achieves IO asynchronicity, you will have to go deeper and find out if it spawns new threads or relies upon asynchronous kernel interfaces.
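
    One cheap way to start digging is to look at which thread actually completes a request. The sketch below uses Java's AsynchronousFileChannel and records the name of the calling thread and of the thread that runs the completion handler. The class name and the three-byte temp file are made up for the example, and which mechanism a given JDK and OS use behind this API is something to verify rather than assume.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.channels.AsynchronousFileChannel;
import java.nio.channels.CompletionHandler;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.concurrent.CompletableFuture;

public class WhoCompletesMyRead {

    // Returns { calling thread name, completing thread name, bytes read }.
    static String[] demo() {
        try {
            Path file = Files.createTempFile("who-completes", ".bin");
            Files.write(file, new byte[] {1, 2, 3});

            String caller = Thread.currentThread().getName();
            CompletableFuture<String[]> done = new CompletableFuture<>();

            try (AsynchronousFileChannel channel =
                         AsynchronousFileChannel.open(file, StandardOpenOption.READ)) {
                ByteBuffer buffer = ByteBuffer.allocate(16);

                // The call returns at once; the handler runs when the read ends.
                channel.read(buffer, 0L, null, new CompletionHandler<Integer, Void>() {
                    @Override
                    public void completed(Integer bytesRead, Void attachment) {
                        String completer = Thread.currentThread().getName();
                        done.complete(new String[] {caller, completer, String.valueOf(bytesRead)});
                    }

                    @Override
                    public void failed(Throwable error, Void attachment) {
                        done.completeExceptionally(error);
                    }
                });

                // Wait for the handler before the channel is closed.
                return done.join();
            } finally {
                Files.delete(file);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String[] result = demo();
        System.out.println("requested on: " + result[0]);
        System.out.println("completed on: " + result[1]);
        System.out.println("bytes read:   " + result[2]);
    }
}
```

    If the two thread names differ, some other thread did the work on your behalf, which is a hint to go and read how that runtime schedules its IO.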

    Afterword

    Realistically, there is very little chance you will encounter genuinely single-threaded systems these days.

    As an example, most people will describe Node.js as having "single-threaded non-blocking" IO. However, this is a simplification. On Linux, truly non-blocking IO is only available for network operations through the epoll interface. For disk IO through the ordinary file interfaces, the kernel will always block the calling thread. To achieve asynchronicity for disk IO (which is relatively slow), the Node.js runtime (or libuv, to be precise) maintains a dedicated thread pool. Whenever an asynchronous disk IO operation is requested, the runtime assigns the work to one of the threads from that pool. That thread does standard blocking disk IO, while the main (calling) thread carries on asynchronously. On top of that, the V8 runtime maintains numerous threads of its own for garbage collection and other managed runtime tasks.
