About the speed of random file read (Python)


What is behind and responsible for this speed-up?

It could be the operating system's disk cache. http://en.wikipedia.org/wiki/Page_cache

Once you've read a chunk of a file from disk, it will hang around in RAM for a while. RAM is orders of magnitude faster than disk, so you'll see a lot of variability in the time it takes to read random pieces of a large file.
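As a rough illustration, timing the same read twice usually makes the cache effect visible. This is a minimal sketch; the file name, offset and chunk size are made-up values:

```python
import time

PATH = "big_file.bin"          # hypothetical large file
OFFSET = 123 * 1024 * 1024     # arbitrary offset into the file
CHUNK = 1024 * 1024            # read 1 MiB

def timed_read():
    start = time.perf_counter()
    with open(PATH, "rb") as f:
        f.seek(OFFSET)
        f.read(CHUNK)
    return time.perf_counter() - start

print("first read :", timed_read())   # likely served from disk (cold)
print("second read:", timed_read())   # likely served from the page cache (warm)
```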

Or, depending on what "db" is, the database implementation could be doing its own caching.

Is there any way to control it?

If it's the disk cache:

It depends on the operating system, but it's typically a pretty coarse-grained control; for example, you may be forced to disable caching for an entire volume, which would affect every other process and every other file using that volume. It would also probably require root/admin access.

See this similar question about disabling caching on Linux: Linux : Disabling File cache for a process?
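One per-file option discussed there is opening the file with O_DIRECT, which bypasses the page cache for that file descriptor. A minimal sketch (assuming Linux, a filesystem that supports O_DIRECT, and a hypothetical file name); note that O_DIRECT requires block-aligned offsets, lengths and buffers:

```python
import mmap
import os

PATH = "big_file.bin"   # hypothetical file
BLOCK = 4096            # O_DIRECT needs block-aligned offsets, lengths and buffers

# os.O_DIRECT is Linux-specific, and not every filesystem supports it.
fd = os.open(PATH, os.O_RDONLY | os.O_DIRECT)
try:
    buf = mmap.mmap(-1, BLOCK)      # anonymous mmap gives a page-aligned buffer
    nread = os.readv(fd, [buf])     # read one aligned block, bypassing the page cache
    data = bytes(buf[:nread])
finally:
    os.close(fd)
```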

Depending on what you're trying to do, you can force-flush the disk cache. This can be useful in situations where you want to run a test with a cold cache, letting you get an idea of the worst-case performance. (This also depends on your OS and may require root/admin access.)
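For example, on Linux you can evict a single file's pages from the page cache with posix_fadvise, which doesn't need root; flushing the entire cache does. A sketch, assuming Python 3 on Linux and a hypothetical file name:

```python
import os

PATH = "big_file.bin"   # hypothetical file whose cached pages we want to evict

fd = os.open(PATH, os.O_RDONLY)
try:
    os.fsync(fd)                                         # flush any dirty pages first
    os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)   # length 0 = the whole file
finally:
    os.close(fd)

# Flushing the *entire* page cache needs root, e.g. from a shell:
#   sync && echo 3 | sudo tee /proc/sys/vm/drop_caches
```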

If it's the database:

Depends on the database. If it's a local database, you may just be seeing disk cache effects, or the database library could be doing its own caching. If you're talking to a remote database, the caching could be happening locally or remotely (or both).

There may be configuration options to disable or control caching at either of these layers.
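For instance, if "db" happens to be SQLite (an assumption; the question doesn't say), the library keeps its own page cache that you can inspect and shrink:

```python
import sqlite3

conn = sqlite3.connect("example.db")   # hypothetical database file
# Current page-cache size (negative values are KiB, positive values are pages).
print(conn.execute("PRAGMA cache_size").fetchone())
# Shrink SQLite's own page cache so timings reflect disk/OS behaviour more directly.
conn.execute("PRAGMA cache_size = 0")
```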
