Many small files or one big file? (Or, Overhead of opening and closing file handles) (C++)

春和景丽 2021-02-01 21:22

I have created an application that does the following:

  1. Make some calculations and write the calculated data to a file; repeat 500,000 times (over al
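For concreteness, a minimal sketch of the two layouts the title contrasts; compute_record and the file names are placeholders, not taken from the question:

```cpp
#include <fstream>
#include <string>

// Stand-in for the real calculation.
static std::string compute_record(int i) { return std::to_string(i * i); }

// One file per record: every iteration pays the open/close overhead.
static void many_small_files(int n) {
    for (int i = 0; i < n; ++i) {
        std::ofstream out("record_" + std::to_string(i) + ".txt");
        out << compute_record(i) << '\n';
    }
}

// One big file: the handle is opened once and the stream buffer
// batches the writes.
static void one_big_file(int n) {
    std::ofstream out("records.txt");
    for (int i = 0; i < n; ++i)
        out << compute_record(i) << '\n';
}

int main() {
    many_small_files(1000);
    one_big_file(1000);
}
```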
6 Answers
  •  你的背包
    2021-02-01 21:41

    From your brief explanation, it sounds like xtofl's suggestion of threads is the correct way to go. I would recommend you profile your application first, though, to make sure the time really is divided between I/O and CPU.
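    A rough sketch of that measurement, assuming the loop body splits cleanly into a calculation phase and a write phase (do_calculation and write_result are stand-ins, not names from the question):

    ```cpp
    #include <chrono>
    #include <cmath>
    #include <fstream>
    #include <iostream>

    // Stand-ins for the real work.
    static double do_calculation(int i) { return std::sqrt(i) * 3.0; }
    static void write_result(std::ofstream& out, double v) { out << v << '\n'; }

    int main() {
        using clock = std::chrono::steady_clock;
        clock::duration compute_time{}, io_time{};
        std::ofstream out("results.txt");

        for (int i = 0; i < 500000; ++i) {
            auto t0 = clock::now();
            double v = do_calculation(i);  // CPU phase
            auto t1 = clock::now();
            write_result(out, v);          // I/O phase
            auto t2 = clock::now();
            compute_time += t1 - t0;
            io_time += t2 - t1;
        }
        using ms = std::chrono::milliseconds;
        std::cout << "compute: " << std::chrono::duration_cast<ms>(compute_time).count()
                  << " ms, io: " << std::chrono::duration_cast<ms>(io_time).count()
                  << " ms\n";
    }
    ```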

    Then I would consider three threads joined by two queues.

    1. Thread 1 reads the files and loads them into RAM, then places the data/pointers in the queue. If the queue grows past a certain size the thread sleeps; once it drops below a certain size it starts again.
    2. Thread 2 reads the data off the queue, does the calculations, then writes the results to the second queue.
    3. Thread 3 reads the second queue and writes the data to disk.

    You could consider merging threads 1 and 3; this might reduce contention on the disk, since your app would then perform only one disk operation at a time. A sketch of the pipeline follows.
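    A minimal sketch of this three-thread / two-queue layout, assuming a small hand-rolled BoundedQueue for the back-pressure described in step 1 and an empty std::optional as the end-of-stream marker; the item type and sizes are placeholders:

    ```cpp
    #include <condition_variable>
    #include <fstream>
    #include <mutex>
    #include <optional>
    #include <queue>
    #include <string>
    #include <thread>

    // Bounded queue: push() sleeps while the queue is full (thread 1's
    // back-pressure), pop() sleeps while it is empty.
    template <typename T>
    class BoundedQueue {
        std::queue<T> q_;
        std::mutex m_;
        std::condition_variable not_full_, not_empty_;
        std::size_t cap_;
    public:
        explicit BoundedQueue(std::size_t cap) : cap_(cap) {}
        void push(T v) {
            std::unique_lock<std::mutex> lk(m_);
            not_full_.wait(lk, [&] { return q_.size() < cap_; });
            q_.push(std::move(v));
            not_empty_.notify_one();
        }
        T pop() {
            std::unique_lock<std::mutex> lk(m_);
            not_empty_.wait(lk, [&] { return !q_.empty(); });
            T v = std::move(q_.front());
            q_.pop();
            not_full_.notify_one();
            return v;
        }
    };

    int main() {
        using Item = std::optional<std::string>;   // empty = end of stream
        BoundedQueue<Item> raw(100), done(100);

        std::thread reader([&] {                   // thread 1: load input
            for (int i = 0; i < 1000; ++i)
                raw.push("input " + std::to_string(i));
            raw.push(std::nullopt);
        });
        std::thread worker([&] {                   // thread 2: calculate
            while (auto item = raw.pop())
                done.push("processed " + *item);
            done.push(std::nullopt);
        });
        std::thread writer([&] {                   // thread 3: write to disk
            std::ofstream out("out.txt");
            while (auto item = done.pop())
                out << *item << '\n';
        });

        reader.join(); worker.join(); writer.join();
    }
    ```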

    Also, how does the operating system handle all the files? Are they all in one directory? What is performance like when you browse that directory (GUI file manager / dir / ls)? If it is bad, you may be working outside your file system's comfort zone. Although you can realistically only change this on Unix, some file systems are optimised for particular usage patterns, e.g. large files or lots of small files. You could also consider splitting the files across different directories, along the lines of the helper sketched below.
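    As one way to do that split, a hypothetical helper that hashes each file name into one of 256 subdirectories, so no single directory ends up holding all 500,000 entries:

    ```cpp
    #include <filesystem>
    #include <functional>
    #include <string>

    // Map a file name to root/<bucket>/<name>, where <bucket> is derived
    // from a hash of the name. 256 buckets is an arbitrary choice.
    std::filesystem::path shard_path(const std::filesystem::path& root,
                                     const std::string& name) {
        std::size_t bucket = std::hash<std::string>{}(name) % 256;
        auto dir = root / std::to_string(bucket);
        std::filesystem::create_directories(dir);  // no-op if it exists
        return dir / name;
    }
    ```

    For example, shard_path("data", "result_12345.txt") might yield data/137/result_12345.txt.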
