Efficiently reading a very large text file in C++

Frontend · unresolved · 4 answers · 1670 views
Asked by 攒了一身酷, 2020-12-02 16:13

I have a very large text file (45 GB). Each line of the file contains two space-separated 64-bit unsigned integers, as shown below.

4624996948753406865 10214715013

4 Answers
  •  庸人自扰
    2020-12-02 16:57

    On Linux, using C stdio instead of C++ streams might help performance (C++ streams are typically built on top of C FILE streams). You could use getline(3), fgets(3), or fscanf(3). You might set a larger buffer (e.g. 64 KiB or 256 KiB) using setbuffer(3), etc. But I would guess your (improved) program will be I/O-bound, not CPU-bound. Then you could play with posix_fadvise(2).
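
    The stdio approach above might be sketched as follows. This is a minimal illustration, not a tuned implementation: the function name `read_pairs` and the 256 KiB buffer size are my own choices, and accumulating every pair in a vector would need roughly the file's size in RAM for a 45 GB input, so in practice you would process pairs as you read them.

    ```cpp
    #include <cstdio>
    #include <cstdint>
    #include <cinttypes>
    #include <vector>
    #include <utility>

    // Sketch: parse one "a b" pair per line with C stdio and a large buffer.
    std::vector<std::pair<uint64_t, uint64_t>> read_pairs(const char *path) {
        std::vector<std::pair<uint64_t, uint64_t>> out;
        FILE *f = std::fopen(path, "r");
        if (!f) return out;
        static char buf[1 << 18];                  // 256 KiB stdio buffer
        std::setvbuf(f, buf, _IOFBF, sizeof buf);  // portable cousin of setbuffer(3)
        uint64_t a, b;
        // fscanf treats the newline between lines as ordinary whitespace.
        while (std::fscanf(f, "%" SCNu64 " %" SCNu64, &a, &b) == 2)
            out.emplace_back(a, b);
        std::fclose(f);
        return out;
    }
    ```

    Note that setvbuf(3) must be called before the first read on the stream, which is why it sits directly after fopen(3).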

    You might consider memory mapping with mmap(2) and madvise(2) (see also the m mode for fopen(3)). See also readahead(2).
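
    A minimal mmap sketch, assuming Linux/POSIX; the helper name `count_pairs_mmap` is hypothetical, and for simplicity it only counts pairs rather than doing real work with them. Mapping the whole 45 GB file is fine on 64-bit Linux because pages are faulted in lazily, and MADV_SEQUENTIAL hints the kernel to read ahead aggressively.

    ```cpp
    #include <cstdint>
    #include <cstdlib>
    #include <fcntl.h>
    #include <sys/mman.h>
    #include <sys/stat.h>
    #include <unistd.h>

    // Sketch: map the whole file read-only and parse the pairs in memory.
    size_t count_pairs_mmap(const char *path) {
        int fd = open(path, O_RDONLY);
        if (fd < 0) return 0;
        struct stat st;
        if (fstat(fd, &st) != 0) { close(fd); return 0; }
        size_t len = (size_t)st.st_size;
        char *p = (char *)mmap(nullptr, len, PROT_READ, MAP_PRIVATE, fd, 0);
        close(fd);  // the mapping stays valid after closing the descriptor
        if (p == MAP_FAILED) return 0;
        madvise(p, len, MADV_SEQUENTIAL);  // hint: we scan front to back
        size_t n = 0;
        char *cur = p, *end = p + len;
        while (cur < end) {
            char *next;
            // strtoull skips leading whitespace, including newlines.
            uint64_t a = strtoull(cur, &next, 10);
            if (next == cur) break;  // no more digits
            cur = next;
            uint64_t b = strtoull(cur, &next, 10);
            if (next == cur) break;
            cur = next;
            (void)a; (void)b;  // a real program would process the pair here
            ++n;
        }
        munmap(p, len);
        return n;
    }
    ```

    One caveat with strtoull over a mapping: the file is not NUL-terminated, so this sketch relies on the parse stopping at `end`; a robust version would bound each conversion explicitly or map one extra zero page.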

    Finally, if your algorithm permits it, you might split the file into smaller pieces with csplit(1) and process each piece in a separate parallel process.
