How to scan through really huge files on disk?

北荒 2020-12-09 18:14

Consider a really huge file (possibly larger than 4 GB) on disk. I want to scan through this file and count how many times a specific binary pattern occurs.

My thought

8 answers
  •  予麋鹿 (original poster)
    2020-12-09 18:47

    Creating 20 threads, each supposed to handle some 100 MB of the file, is likely to only worsen performance, since the hard disk will have to read from several unrelated places at the same time.

    HD performance is at its peak when it reads sequential data. So assuming your huge file is not fragmented, the best thing to do would probably be to use just one thread and read from start to end in chunks of a few (say 4) MB.
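
A minimal sketch of the single-threaded approach described above, in Python. The function names `count_pattern` and `count_pattern_in_file` are hypothetical, and the 4 MB default only mirrors the figure suggested in this answer. One detail the answer does not mention is assumed here: a match can straddle two chunks, so the last `len(pattern) - 1` bytes of each buffer are carried over to the next read. Overlapping occurrences are counted separately.

```python
def count_pattern(stream, pattern, chunk_size=4 * 1024 * 1024):
    """Count occurrences of `pattern` in a binary stream, reading it sequentially."""
    if not pattern:
        raise ValueError("pattern must not be empty")
    keep = len(pattern) - 1   # bytes carried over so boundary matches are not lost
    tail = b""
    total = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:         # empty read means end of stream
            break
        buf = tail + chunk
        pos = buf.find(pattern)
        while pos != -1:
            total += 1
            # Advance by one byte so overlapping matches are counted too.
            pos = buf.find(pattern, pos + 1)
        # The tail is shorter than the pattern, so it can never contain a
        # complete match and nothing is counted twice.
        tail = buf[-keep:] if keep else b""
    return total


def count_pattern_in_file(path, pattern, chunk_size=4 * 1024 * 1024):
    """Open `path` in binary mode and scan it from start to end in one pass."""
    with open(path, "rb") as f:
        return count_pattern(f, pattern, chunk_size)
```

Memory use stays around one chunk plus the pattern length regardless of file size, so a file larger than 4 GB is handled the same way as a small one.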

    But what do I know. File systems and caches are complex. Do some testing and see what works best.
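
One way to do that testing is to time a plain sequential read at a few chunk sizes. The sketch below is only an illustration: `time_sequential_read` is a hypothetical helper, and the path and chunk sizes in the usage comment are placeholders. Repeated runs on the same file are likely to be served from the operating system's cache, so the numbers are only meaningful on a file that has not been read recently.

```python
import time


def time_sequential_read(path, chunk_size):
    """Read `path` from start to end in `chunk_size` pieces.

    Returns (bytes_read, elapsed_seconds).
    """
    total = 0
    start = time.perf_counter()
    # buffering=0 disables Python's own buffer so each read maps to one OS read.
    with open(path, "rb", buffering=0) as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    return total, time.perf_counter() - start


# Example usage (placeholder path and sizes):
# for size in (64 * 1024, 1024 * 1024, 4 * 1024 * 1024):
#     n, secs = time_sequential_read("huge.bin", size)
#     print(size, n, secs)
```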
