C reading (from stdin) stops at 0x1a character

狂风中的少年 posted on 2019-11-26 23:26:08

Question


I'm currently implementing the Burrows-Wheeler transform (and its inverse) for raw binary data such as JPEG files. Testing on ordinary data like text files causes no problems, but when reading a JPEG file, for example, input stops at the character 0x1a, the substitute (SUB) character. I've searched the internet for a solution that doesn't require OS-dependent code, without success. I considered reading stdin in binary mode, but that doesn't seem easy. Is there a simple way to solve this problem?

code:

/* One extra byte so that every block can be NUL-terminated. */
buffer = (unsigned char*) calloc(block_size + 1, sizeof(unsigned char));
if (buffer != NULL) {
    /* fread returns a short count for the last block and 0 at end of input,
       so one loop covers full blocks, the final partial block and empty input. */
    while ((length = fread(buffer, 1, block_size, stdin)) > 0) {
        buffer[length] = '\0';
        encodeBlock(buffer, length);
    }
    free(buffer);
}

Answer 1:


As you've noticed, you're reading from stdin in text mode, and the read stops when it hits the SUB character (substitute, aka CTRL+Z, aka the DOS end-of-file marker).

On Windows you have to switch the stream to binary mode with _setmode:

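/* _WIN32 is the macro Windows compilers predefine;
   plain WIN32 is only set by some headers and project settings. */
#if defined(_WIN32)
#include <io.h>     /* _setmode */
#include <fcntl.h>  /* _O_BINARY */
#include <stdio.h>  /* _fileno, stdin */
#endif /* defined(_WIN32) */

/* ... */

#if defined(_WIN32)
/* Do this before the first read from stdin. */
_setmode(_fileno(stdin), _O_BINARY);
#endif /* defined(_WIN32) */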

On platforms other than Windows you don't run into this distinction in modes.
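
To make the fix above concrete, here is a minimal sketch that wraps the mode switch in a helper and then reads a stream block by block. The names set_stdin_binary and count_bytes are hypothetical and not part of the original answer; the 4096-byte buffer size is an arbitrary choice.

```c
#include <assert.h>
#include <stdio.h>
#if defined(_WIN32)
#include <io.h>
#include <fcntl.h>
#endif

/* Hypothetical helper: put stdin into binary mode.
   Returns 0 on success, -1 on failure. A no-op outside Windows. */
static int set_stdin_binary(void)
{
#if defined(_WIN32)
    return _setmode(_fileno(stdin), _O_BINARY) == -1 ? -1 : 0;
#else
    return 0;
#endif
}

/* Hypothetical helper: count the bytes in a stream, reading in blocks.
   On a binary stream, 0x1a and 0x00 are ordinary bytes. */
static size_t count_bytes(FILE *in)
{
    unsigned char buf[4096];
    size_t total = 0;
    size_t n;

    assert(in != NULL);
    while ((n = fread(buf, 1, sizeof buf, in)) > 0)
        total += n;
    return total;
}
```

A program would call set_stdin_binary() once at startup, before the first read, and then pass stdin to count_bytes (or to its own fread loop).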




Answer 2:


You cannot do this without an OS dependency. The C99 standard says (7.19.3):

At program startup, three text streams are predefined...

stdin is a text stream. Depending on your OS, there may be ways to change the mode of an existing stream or to access the low-level stream data, but you said that you do not want any OS-specific code.




Answer 3:


You must open the file as a binary file.

Use something similar to

fopen("file", "rb");
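
As a rough illustration of this approach, the sketch below opens a named file with "rb" and reads its first bytes into a caller-supplied buffer. The helper name read_binary_file is hypothetical, and it only applies when the program is given a file name rather than piped input.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: read up to cap bytes from the file at path,
   opened in binary mode. Returns the number of bytes read
   (0 if the file cannot be opened or is empty). */
static size_t read_binary_file(const char *path, unsigned char *buf, size_t cap)
{
    FILE *fp;
    size_t n;

    assert(path != NULL && buf != NULL);
    fp = fopen(path, "rb");   /* the "b" disables text-mode translation */
    if (fp == NULL)
        return 0;
    n = fread(buf, 1, cap, fp);
    fclose(fp);
    return n;
}
```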



Answer 4:


On Windows, you can use _setmode to switch stdin to binary mode.

There is also freopen -- see this SO question
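
A sketch of the freopen route, assuming the C99 form in which a null file name asks the library to change the mode of an already open stream. The standard leaves it implementation-defined which mode changes are permitted, and not every C library supports this call, so the result has to be checked. The helper name reopen_binary is hypothetical.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: ask the C library to switch an open stream to
   binary read mode. Returns 0 on success, -1 if the library refuses.
   Caution: after a failed freopen the stream is closed and must not be used. */
static int reopen_binary(FILE *stream)
{
    assert(stream != NULL);
    return freopen(NULL, "rb", stream) != NULL ? 0 : -1;
}
```

A caller would invoke reopen_binary(stdin) before the first read and fall back to a platform-specific call such as _setmode only where this form is known not to work.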




Answer 5:


Use read() to read the data.
Since you want the data from stdin, use

fd = fcntl(STDIN_FILENO, F_DUPFD, 0);

to obtain a duplicate of stdin's file descriptor. (STDIN_FILENO can also be passed to read() directly.)

More info here.

The issue arises because Windows treats 0x1a (CTRL+Z) as end-of-file on text-mode streams. As Earlz pointed out, opening the file in binary mode fixes this on Windows and works on Linux too.
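
For the read() route, a minimal POSIX-only sketch might look like the following. The helper name count_fd_bytes is hypothetical, and the code assumes a system that provides <unistd.h>; it does not address Windows.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>

/* Hypothetical helper: count the bytes available on a file descriptor
   using read(), which bypasses stdio streams entirely.
   Returns the byte count, or -1 on a read error. */
static long count_fd_bytes(int fd)
{
    unsigned char buf[4096];
    long total = 0;

    assert(fd >= 0);
    for (;;) {
        ssize_t n = read(fd, buf, sizeof buf);
        if (n > 0)
            total += (long) n;   /* got a (possibly partial) block */
        else if (n == 0)
            return total;        /* end of input */
        else if (errno != EINTR)
            return -1;           /* real error; EINTR just retries */
    }
}
```

Calling count_fd_bytes(STDIN_FILENO) reads standard input to the end; a real encoder would process each block inside the loop instead of only counting.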



Source: https://stackoverflow.com/questions/12942518/c-reading-from-stdin-stops-at-0x1a-character
