I face the challenge of reading/writing files (several gigabytes in size) line by line.
Reading many forum entries and sites (including a bunch of SO posts), mmap was suggested as the f
You're using stringstreams to store the lines you identify. That is not comparable to the getline implementation, because the stringstream itself adds overhead. As others suggested, you can store the beginning of the line as a char*, and maybe the length of the line (or a pointer to the end of the line). The body of the read loop would be something like:
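    char* str_start = map;
    char* str_end;
    // Use i < FILESIZE: valid indices are 0 .. FILESIZE-1, so map[FILESIZE] is out of bounds.
    for (long i = 0; i < FILESIZE; ++i) {
        if (map[i] == '\n') {
            str_end = map + i;  // points at the '\n', one past the last char of the line
            {
                // C style tokenizing of the range [str_start, str_end)
                // If you want, you can build a std::string like:
                // std::string line(str_start, str_end);
                // but note that this implies a memory copy.
            }
            str_start = map + i + 1;  // the next line starts right after the '\n'
        }
    }
    // If the file does not end with '\n', the last line is [str_start, map + FILESIZE).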
Note also that this is much more efficient because it does no per-character work beyond the newline comparison (in your version you were appending every character to the stringstream).
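If you want to go one step further, the byte-by-byte comparison can be handed to memchr, which is usually well optimized by the C library. The following is only a sketch of that variation, not the code from the question: for_each_line and its callback signature are hypothetical names chosen for illustration, and it assumes the whole file is visible as one contiguous read-only buffer (for example the mmap'ed region).

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper: calls on_line(begin, end) once per line found in
// [data, data + size). No bytes are copied; the callback only receives
// pointers into the buffer. A final line without a trailing '\n' is
// reported too. Returns the number of lines seen.
template <typename Callback>
std::size_t for_each_line(const char* data, std::size_t size, Callback on_line) {
    assert(data != nullptr || size == 0);
    const char* cur = data;
    const char* const end = data + size;
    std::size_t count = 0;
    while (cur < end) {
        // memchr returns a pointer to the next '\n', or nullptr if none is left.
        const void* hit = std::memchr(cur, '\n', static_cast<std::size_t>(end - cur));
        const char* line_end = hit ? static_cast<const char*>(hit) : end;
        on_line(cur, line_end);  // the line is the half-open range [cur, line_end)
        ++count;
        // Skip the '\n' itself, or stop if the last line had no terminator.
        cur = (line_end == end) ? end : line_end + 1;
    }
    return count;
}
```

With the mapping from the question, a call would look roughly like for_each_line(map, FILESIZE, [](const char* b, const char* e) { /* tokenize [b, e) */ });. Whether this beats the plain loop depends on your line lengths and C library, so measure both.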