How to scan through really huge files on disk?

北荒 2020-12-09 18:14

Consider a really huge file (possibly larger than 4 GB) on disk. I want to scan through this file and count how many times a specific binary pattern occurs.

My thought

8 Answers
  • 2020-12-09 18:54

    I'd go with a single thread too, not only because of hard-disk performance, but because splitting the file introduces edge cases that are hard to manage: what if an occurrence of your pattern straddles the point where you split the file?
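
A minimal Python sketch of the single-threaded approach, for illustration only. The function name `count_pattern` and the `chunk_size` default are assumptions, not something from the thread. It reads the file in fixed-size chunks and carries the last `len(pattern) - 1` bytes over to the next chunk, so an occurrence that straddles a chunk boundary is still seen exactly once. Overlapping occurrences are counted separately here, which is also an assumption about what "occurs" should mean.

```python
def count_pattern(path, pattern, chunk_size=1 << 20):
    """Count (possibly overlapping) occurrences of `pattern` in the file at `path`."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")

    count = 0
    tail = b""  # last m-1 bytes of everything read so far
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            window = tail + chunk
            # Every match found here ends inside `chunk`, because `tail`
            # alone is too short (m-1 bytes) to contain a full match.
            pos = window.find(pattern)
            while pos != -1:
                count += 1
                pos = window.find(pattern, pos + 1)
            # Keep only what a boundary-straddling match could still need.
            tail = window[-(m - 1):] if m > 1 else b""
    return count
```

Calling `count_pattern("big.bin", b"\x00\xff\x00")` (with a hypothetical file name) would scan the file using roughly one chunk of memory at a time, regardless of file size.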

  • 2020-12-09 18:58

    I would have one thread read the file (possibly as a stream) into an array and have another thread process it. I wouldn't map several parts of the file at one time, because of disk seeks. I would probably use a ManualResetEvent to tell the processing thread when the next N bytes are ready to be processed. Assuming your processing code is faster than the HDD, I would use two buffers, one to fill and the other to process, and just switch between them each time.
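
A rough Python sketch of the two-buffer idea, under several assumptions. The answer names .NET's ManualResetEvent; here two `queue.Queue` objects stand in for that signalling, which is a substitution and not the answerer's exact mechanism. The function name `count_pattern_double_buffered` and the `buf_size` default are hypothetical. One thread fills whichever buffer is free while the main thread scans the other, and the last `len(pattern) - 1` bytes are carried across buffers so boundary-straddling matches are not missed.

```python
import queue
import threading


def count_pattern_double_buffered(path, pattern, buf_size=1 << 20):
    """Count (possibly overlapping) matches using a reader thread and two buffers."""
    m = len(pattern)
    if m == 0:
        raise ValueError("pattern must not be empty")

    free = queue.Queue()    # buffers the reader may fill
    filled = queue.Queue()  # (buffer, byte_count) pairs ready to scan
    for _ in range(2):
        free.put(bytearray(buf_size))

    def reader(f):
        try:
            while True:
                buf = free.get()       # wait until a buffer is free
                n = f.readinto(buf)    # fill it from disk
                if not n:
                    return
                filled.put((buf, n))   # signal: data is ready
        finally:
            filled.put(None)           # always tell the consumer to stop

    count = 0
    tail = b""
    with open(path, "rb") as f:
        t = threading.Thread(target=reader, args=(f,), daemon=True)
        t.start()
        while True:
            item = filled.get()
            if item is None:
                break
            buf, n = item
            window = tail + buf[:n]    # copies the valid bytes out of the buffer
            free.put(buf)              # hand the buffer back to the reader
            pos = window.find(pattern)
            while pos != -1:
                count += 1
                pos = window.find(pattern, pos + 1)
            tail = window[-(m - 1):] if m > 1 else b""
        t.join()
    return count
```

Whether this beats a plain single-threaded loop depends on how the scan time compares with the disk read time, so it is worth measuring both on the actual hardware before committing to the extra complexity.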
