Why is the value returned by fstream::tellg() increased by the number of newlines in the input text file when the file uses Windows line endings (\r\n)?

Posted by 霸气de小男生 on 2019-12-12 01:59:35

Question


The program opens an input file and prints the current read/write position several times.

If the file uses '\n' as the newline, the values are as expected: 0, 1, 2, 3.

On the other hand, if the newline is '\r\n', then after the first read the positions returned by all tellg() calls appear to be offset by the number of newlines in the file. The output is: 0, 5, 6, 7.

All returned values are increased by 4, which is the number of newlines in the example input file.

#include <fstream>
#include <iostream>
#include <iomanip>
using std::cout;
using std::setw;
using std::endl;

int main()
{
    std::fstream ioff("su9.txt");
    if(!ioff) return -1;
    int c = 0;

    cout << setw(30) << std::left << " Before any operation " << ioff.tellg() << endl;

    c = ioff.get();
    cout << setw(30) << std::left << " After first 'get' " << ioff.tellg() << " Character read: " << (char)c << endl;

    c = ioff.get();
    cout << setw(30) << std::left << " After second 'get' " << ioff.tellg() << " Character read: " << (char)c << endl;

    c = ioff.get();
    cout << setw(30) << std::left << " Third 'get' " << ioff.tellg() << "\t\tCharacter read: " << (char)c << endl;

    return 0;
}

The input file is 5 lines long (it has 4 newlines), with the following content:

-------------------------------------------
abcd
efgh
ijkl


--------------------------------------------

output (\n):

Before any operation         0
After first 'get'            1      Character read: a
After second 'get'           2      Character read: b
Third 'get'                  3      Character read: c

output (\r\n):

Before any operation         0
After first 'get'            5      Character read: a
After second 'get'           6      Character read: b
Third 'get'                  7      Character read: c

Notice that the characters themselves are read correctly.


Answer 1:


The first, and most obvious, question is why you expect any particular values when the results of tellg are converted to an integral type. The only defined use of the result of tellg is as a later argument to seekg; it has no defined numerical significance whatsoever.

Having said that: in Unix and Windows implementations, they will practically always correspond to the byte offset of the physical position in the file. This means that they will have some significance if the file is opened in binary mode. Under Windows, for example, text mode (the default) maps the two-character sequence 0x0D, 0x0A in the file to the single character '\n', and treats the single character 0x1A as end of file. (Binary and text mode are identical under Unix, so things often seem to work there even when they aren't guaranteed.)

I might add that I cannot reproduce your results with MSC++. Not that that means anything; as I said, the only requirement for tellg is that the returned value can be used in a seekg to return to the same place. (Another issue might be how you created the files. Might one of them start with a UTF-8 encoding of a BOM, for example, and the other not?)



Source: https://stackoverflow.com/questions/27234202/why-fstreamtellg-return-value-is-enlarged-by-the-number-of-newlines-in-the-i
