Question
Here I am using two different functions to calculate a CRC16 for any type of file (.txt, .tar, .tar.gz, .bin, .scr, .sh, etc.), with sizes ranging from 1 KB to 5 GB.
I want to achieve this:
`cross-platform
less time-consuming
has to work properly for any type of file and any size`
I get the same CRC value from both functions, but can anyone tell me which one is better for calculating a CRC16 for any type of file of any size on different platforms?
All byte values from 0 to 255 have to be handled.
Can anybody suggest which one fits my requirements?
Code of both functions:
The first one uses int as the datatype of readChar:
int CRC16_int(const char* filePath) {
    //Declare variable to store CRC result.
    unsigned short result;
    //Declare loop variables.
    int intInnerLoopIndex;
    result = 0xffff; //initialize result variable to perform CRC checksum calculation.
    //Store message which read from file.
    //char content[2000000];
    //Create file pointer to open and read file.
    FILE *readFile;
    //Use to read character from file.
    int readChar;
    //open a file for Reading
    readFile = fopen(filePath, "rb");
    //Checking file is able to open or exists.
    if (!readFile) {
        fputs("Unable to open file %s", stderr);
    }
    /*
    Here reading file and store into variable.
    */
    int chCnt = 0;
    while ((readChar = getc(readFile)) != EOF) {
        //printf("charcater is %c\n",readChar);
        //printf("charcater is %c and int is %d \n",readChar,readChar);
        result ^= (short) (readChar);
        for (intInnerLoopIndex = 0; intInnerLoopIndex < 8; intInnerLoopIndex++) {
            if ((result & 0x0001) == 0x0001) {
                result = result >> 1; //Perform bit shifting.
                result = result ^ 0xa001; //Perform XOR operation on result.
            } else {
                result = result >> 1; //Perform bit shifting.
            }
        }
        //content[chCnt] = readChar;
        chCnt++;
    }
    printf("\nCRC data length in file: %d", chCnt);
    //This is final CRC value for provided message.
    return (result);
}
The second one uses unsigned char as the datatype of readChar:
int CRC16_unchar(const char* filePath) {
    unsigned int filesize;
    //Declare variable to store CRC result.
    unsigned short result;
    //Declare loop variables.
    unsigned int intOuterLoopIndex, intInnerLoopIndex;
    result = 0xffff; //initialize result variable to perform CRC checksum calculation.
    FILE *readFile;
    //Use to read character from file.
    //The problem is if you read a byte from a file with the hex value (for example) 0xfe,
    //then the char value will be -2 while the unsigned char value will be 254.
    //This will significantly affect your CRC
    unsigned char readChar;
    //open a file for Reading
    readFile = fopen(filePath, "rb");
    //Checking file is able to open or exists.
    if (!readFile) {
        fputs("Unable to open file %s", stderr);
    }
    fseek(readFile, 0, SEEK_END); // seek to end of file
    filesize = ftell(readFile); // get current file pointer
    fseek(readFile, 0, SEEK_SET); // seek back to beginning of file
    /*
    Here reading file and store into variable.
    */
    int chCnt = 0;
    for (intOuterLoopIndex = 0; intOuterLoopIndex < filesize; intOuterLoopIndex++) {
        readChar = getc(readFile);
        printf("charcater is %c and int is %d\n",readChar,readChar);
        result ^= (short) (readChar);
        for (intInnerLoopIndex = 0; intInnerLoopIndex < 8; intInnerLoopIndex++) {
            if ((result & 0x0001) == 0x0001) {
                result = result >> 1; //Perform bit shifting.
                result = result ^ 0xa001; //Perform XOR operation on result.
            } else {
                result = result >> 1; //Perform bit shifting.
            }
        }
        chCnt++;
    }
    printf("\nCRC data length in file: %d", chCnt);
    return (result);
}
Please help me figure out this problem.
Thanks
Answer 1:
First things first: don't do the file reading (or whatever the source is) and the CRC calculation in the same function. This is bad design. File reading is typically not completely platform independent (although POSIX is your best friend), but CRC calculation can be done very platform independently. Also, you might want to reuse your CRC algorithm for other kinds of data sources that aren't accessed with fopen().
To give you a hint, the CRC function I always drop into my projects has this prototype:
uint16_t Crc16(const uint8_t* buffer, size_t size,
uint16_t polynomial, uint16_t crc);
You don't have to call the function once and feed it the complete contents of the file. Instead, you can loop through the file in blocks and call the function for each block. The polynomial argument in your case is 0xA001 (which is, by the way, a polynomial in 'reversed' form), and the crc argument is set to 0xFFFF the first time. Each subsequent time you call the function, you pass its previous return value as the crc argument.
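The answer gives only the prototype, so the body below is an assumption: a minimal sketch that reuses the bitwise loop from the question's code behind that prototype. It is not the answerer's actual implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-16 over a buffer (sketch; the body is assumed, only the
 * prototype comes from the answer).
 * polynomial: reversed (LSB-first) form, e.g. 0xA001.
 * crc: running value; pass 0xFFFF for the first block and the previous
 *      return value for every later block. */
uint16_t Crc16(const uint8_t *buffer, size_t size,
               uint16_t polynomial, uint16_t crc)
{
    for (size_t i = 0; i < size; i++) {
        crc ^= buffer[i];                      /* fold the next byte into the low 8 bits */
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 1u) {
                crc = (uint16_t)((crc >> 1) ^ polynomial);
            } else {
                crc >>= 1;
            }
        }
    }
    return crc;
}
```

Because the running value is passed in and returned, feeding the data in one call or in several consecutive blocks should give the same result, which is what makes block-wise file reading possible.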
In your second code fragment (CRC16_unchar) you first determine the file size and then read that number of bytes. Don't do that; because filesize is an unsigned int, it unnecessarily limits you to files of at most 4 GB (in most cases). Just reading until EOF is cleaner IMHO.
Furthermore, I see that you are struggling with signed/unsigned bytes. Keep two things in mind:
- printf doesn't know whether you passed a signed or an unsigned integer. You tell printf with '%d' or '%u' how to interpret the integer.
- Even in C itself there is hardly a difference between a signed and an unsigned integer at the bit level. If you do int8_t x = 255, the stored bits are still 0xFF on typical two's-complement platforms; only the interpretation changes (x then reads as -1).
See this answer for more details about when C uses the signedness of an integer: "When does the signedness of an integer really matter?". Rule of thumb: always use uint8_t for handling raw bytes.
So both functions are fine regarding signedness/integer size.
EDIT: As other users indicated in their answers, read the file in blocks instead of byte by byte:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

uint16_t CRC16_int(const char* filePath) {
    FILE *readFile;
    uint8_t buf[1024]; /* Must not be const: fread() writes into it. */
    size_t len;
    uint16_t result = 0xffff;
    /* Open a file for reading. */
    readFile = fopen(filePath, "rb");
    if (readFile == NULL) {
        exit(1);
    }
    /* Read until EOF. With an element size of 1, fread() returns the number of bytes read. */
    while ( (len = fread(buf, 1, sizeof(buf), readFile)) > 0 ) {
        result = Crc16(buf, len, 0xA001, result);
    }
    /* readFile could be in error state, check it with ferror() or feof() functions. */
    fclose(readFile);
    return result;
}
Also, you should alter your function prototype so that it can report an error, e.g.:
// Return true when successful, false on error. CRC is stored in result.
bool CRC16_int(const char* filePath, uint16_t *result)
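A minimal sketch of how such an error-reporting variant might look. The names crc16_update and crc16_file and the 4096-byte buffer size are assumptions made for illustration; the CRC parameters (0xA001, initial value 0xFFFF) are the ones from the question.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: same bitwise update as in the question,
 * applied to a whole buffer. */
static uint16_t crc16_update(uint16_t crc, const uint8_t *buf, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc & 1u) ? (uint16_t)((crc >> 1) ^ 0xA001u)
                             : (uint16_t)(crc >> 1);
    }
    return crc;
}

/* Returns true on success and stores the CRC in *result.
 * Returns false if the file cannot be opened or a read error occurs;
 * *result is left untouched in that case. */
bool crc16_file(const char *filePath, uint16_t *result)
{
    FILE *fp = fopen(filePath, "rb");
    if (fp == NULL)
        return false;

    uint8_t buf[4096];
    uint16_t crc = 0xFFFF;
    size_t len;

    /* Element size 1, so fread() returns the number of bytes read. */
    while ((len = fread(buf, 1, sizeof buf, fp)) > 0)
        crc = crc16_update(crc, buf, len);

    /* The loop ends on EOF or on an error; ferror() tells them apart. */
    bool ok = !ferror(fp);
    fclose(fp);
    if (ok)
        *result = crc;
    return ok;
}
```

A caller would then check the return value before using the CRC, for example: if (!crc16_file(path, &crc)) { /* report the failure */ }.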
Answer 2:
You want to read and write 8-bit bytes using unsigned char instead of plain char, because char can be either signed or unsigned, and that's up to the compiler (allowed by the C standard). So the value you get from getc() should be converted to unsigned char prior to being used in the CRC calculations. You could also fread() into an unsigned char buffer. If you work with signed chars, sign extension of chars into ints will likely break your CRC calculations.
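To illustrate the sign-extension problem, here is a small sketch that runs one step of the question's CRC update on the byte 0xFE, once widened from unsigned char and once from signed char. The function names are hypothetical and exist only for this comparison.

```c
#include <stdint.h>

/* One byte's worth of the CRC update from the question. The byte is
 * taken as an int so the caller decides how it was widened. */
static uint16_t crc16_step(uint16_t crc, int value)
{
    crc ^= (uint16_t)value;
    for (int bit = 0; bit < 8; bit++)
        crc = (crc & 1u) ? (uint16_t)((crc >> 1) ^ 0xA001u)
                         : (uint16_t)(crc >> 1);
    return crc;
}

/* Byte 0xFE held in an unsigned char widens to 254 (0x00FE),
 * so only the low 8 bits of the CRC are touched by the XOR. */
uint16_t step_from_unsigned(uint16_t crc)
{
    unsigned char byte = 0xFE;
    return crc16_step(crc, byte);
}

/* The same bit pattern held in a signed char is -2, which widens to
 * 0xFFFE in 16 bits, so the XOR also flips the high 8 bits of the CRC. */
uint16_t step_from_signed(uint16_t crc)
{
    signed char byte = -2;
    return crc16_step(crc, byte);
}
```

Starting from the same CRC value, the two functions return different results, which is why every byte with the top bit set would corrupt the checksum if it were read into a signed char.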
Also, per the C standard, fseek(FilePtr, 0, SEEK_END) has undefined behavior for binary streams, and binary streams need not meaningfully support SEEK_END in fseek(). In practice, though, this usually works as we want.
Another thing you should consider is checking for I/O errors. Your code is broken in this respect.
Answer 3:
The datatype you do the calculation with should, in my opinion, not be the same that you read from the file. Doing one function call into the runtime library to read a single byte is simply not efficient. You should probably read on the order of 2-4 KB at a time, and then iterate over each returned "chunk" in whatever manner you choose.
There's also absolutely no point in reading the size of the file in advance. Simply read until a read returns less data than expected, at which point you can inspect feof() and ferror() to figure out what to do; typically you just stop, since you're done. See the fread() manual page.
Source: https://stackoverflow.com/questions/9428468/which-datatype-is-better-in-calculation-of-crc16-for-any-type-of-file