Save a CRC value in a file, without altering the actual CRC Checksum?

痞子三分冷 submitted on 2019-12-22 06:37:51

Question


I am saving objects of my own classes to a file (as stream data).

That all works fine, but I would like to store the CRC checksum of the file inside the file itself.

Then, whenever my application attempts to open a file, it can read the internally stored CRC value.

It then performs a check on the actual file: if the CRC of the file matches the internally stored CRC value, I can process the file normally; otherwise, it displays an error message saying the file is not valid.

I need some advice on how to do this, though. I thought I could do something like this:

  • Save the File from my Application.
  • Calculate the CRC of the Saved File.
  • Edit the Saved File storing the CRC Value.
  • Whenever a File is Opened, Check the CRC matches internal CRC Value.

The problem is that as soon as a single byte of data in the file is altered, the CRC checksum comes out completely different, as expected. Writing the CRC into the file therefore invalidates the value that was just calculated.


Answer 1:


Simply put, you need to exclude the bytes used to store the checksum from the checksum calculation.

Write the checksum as the last thing in the file, calculated over the contents of the file apart from the checksum itself. When you read the file, calculate the checksum over the contents that precede the stored checksum and compare the two. Alternatively, you could write the checksum as the first bytes of the file using random access. Either works, as long as you know where it is.
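
A minimal sketch of the trailing-checksum layout, written in Python for brevity. It assumes CRC-32 as computed by `zlib.crc32` and a 4-byte little-endian trailer; the names `write_with_crc` and `read_and_verify` are illustrative and do not come from the answer.

```python
import struct
import zlib

CRC_SIZE = 4  # assumption: CRC-32 stored as 4 little-endian bytes at the end


def write_with_crc(path, payload):
    """Write the payload followed by the CRC-32 of the payload only."""
    crc = zlib.crc32(payload) & 0xFFFFFFFF
    with open(path, "wb") as f:
        f.write(payload)
        f.write(struct.pack("<I", crc))


def read_and_verify(path):
    """Return the payload, or raise ValueError if the stored CRC does not match."""
    with open(path, "rb") as f:
        data = f.read()
    if len(data) < CRC_SIZE:
        raise ValueError("file too short to contain a checksum")
    # Split off the trailer so it is never fed into the CRC calculation.
    payload, trailer = data[:-CRC_SIZE], data[-CRC_SIZE:]
    (stored,) = struct.unpack("<I", trailer)
    if (zlib.crc32(payload) & 0xFFFFFFFF) != stored:
        raise ValueError("checksum mismatch: file is corrupt or was modified")
    return payload
```

Because the trailer is sliced off before the calculation, writing it never changes the value it records.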




Answer 2:


I'd generally prefer the approach where the CRC is excluded from the checking. But if that's not possible for some reason, there is a workaround:

You need to reserve 8 bytes: 4 for the CRC and 4 for compensation data. First fill the reserved bytes with a dummy value (say 0x00). Then calculate the CRC of the file and write it into the first 4 bytes, and finally change the other 4 bytes so that the CRC of the whole file stays the same as the value you just stored.

For details on how to perform this calculation: Reversing CRC32
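
As a rough illustration of the compensation step, here is a sketch in Python. It assumes the standard reflected CRC-32 used by zlib (polynomial 0xEDB88320) and places the 8 reserved bytes at the end of the data; `force_crc32` and `seal` are hypothetical names, and this is one common way to do the reversal, not necessarily the one used in the linked write-up.

```python
import struct
import zlib


def _make_tables():
    """Build the reflected CRC-32 table and a reverse lookup keyed by top byte."""
    table = []
    for i in range(256):
        c = i
        for _ in range(8):
            c = (c >> 1) ^ 0xEDB88320 if c & 1 else c >> 1
        table.append(c)
    # The top bytes of the 256 entries are all distinct, so this map is complete.
    reverse = {entry >> 24: index for index, entry in enumerate(table)}
    return table, reverse


TABLE, REVERSE = _make_tables()


def _step_back(state, byte):
    """Undo one CRC-32 byte update: return the register value before `byte`."""
    index = REVERSE[state >> 24]
    return (((state ^ TABLE[index]) << 8) & 0xFFFFFFFF) | (index ^ byte)


def force_crc32(data, pos, target):
    """Return a copy of data whose 4 bytes at pos make zlib.crc32 equal target."""
    if pos < 0 or pos + 4 > len(data):
        raise ValueError("patch position out of range")
    # Register value after the bytes before the patch (undo the final XOR).
    before = (zlib.crc32(data[:pos]) & 0xFFFFFFFF) ^ 0xFFFFFFFF
    # Register value the patch must produce: run backwards from the target.
    state = target ^ 0xFFFFFFFF
    for byte in reversed(data[pos + 4:]):
        state = _step_back(state, byte)
    # Step back over four zero bytes; the real patch bytes differ by a plain XOR.
    for _ in range(4):
        state = _step_back(state, 0)
    patch = struct.pack("<I", state ^ before)
    return data[:pos] + patch + data[pos + 4:]


def seal(payload):
    """Append 4 CRC bytes and 4 compensation bytes so the file CRC equals the stored CRC."""
    body = payload + b"\x00" * 8  # dummy values in the reserved area
    crc = zlib.crc32(body) & 0xFFFFFFFF
    body = payload + struct.pack("<I", crc) + b"\x00" * 4
    return force_crc32(body, len(payload) + 4, crc)
```

With this in place, `zlib.crc32(seal(payload))` should equal the 4-byte value stored just before the compensation bytes, and `force_crc32` can steer any buffer with 4 spare bytes to a chosen CRC.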


I actually used this in one of my projects:

I was designing a file format based on zip. The first file in the archive is stored uncompressed and serves as a header file. This also means it is stored at a fixed offset in the file. So far pretty standard, and similar to, for example, ePub.

Now I decided to include a SHA-1 hash in the header, to give each file a unique content-based ID and for integrity checking. Since the header, and thus the SHA-1 hash, is at a known offset in the file, masking it when hashing is trivial. So I put in a dummy hash and create the zip file, then hash the file and fill in the real hash.

But now there is a problem: Zip stores the CRC of all contained files. And not only in one place which would be easy to mask when sha1-hashing, but in a second place with variable offset near the end of the file. So I decided to go with CRC faking, so I get my strong hash, and zip gets its valid CRC32.

And since I was already faking the CRC for the final file, I decided faking it for the original header file wouldn't hurt either. Thus all files in this format now start with a header file that has the CRC 0xD1CE0DD5.




Answer 3:


Store the CRC as part of the file itself, but don't include the data for it in the CRC calculation. If you have some sort of fixed header, zero out the CRC field before passing it to the CRC function. If not, just append it to the end of the file and pass everything but the last 4 bytes into the CRC function.
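
The fixed-header variant could be sketched like this in Python. The header layout (a 4-byte signature followed by a 4-byte CRC field) and every name in the block are assumptions made for illustration; `zlib.crc32` stands in for whatever CRC function the application uses.

```python
import struct
import zlib

MAGIC = b"DEMO"                 # hypothetical 4-byte file signature
HEADER = struct.Struct("<4sI")  # hypothetical header: signature + CRC-32 field
CRC_OFFSET = 4                  # the CRC field starts right after the signature


def crc_with_field_zeroed(blob):
    """CRC-32 of the whole file, treating the 4-byte CRC field as zeros."""
    masked = blob[:CRC_OFFSET] + b"\x00" * 4 + blob[CRC_OFFSET + 4:]
    return zlib.crc32(masked) & 0xFFFFFFFF


def build_file(payload):
    """Return header + payload with the CRC field filled in."""
    blob = HEADER.pack(MAGIC, 0) + payload
    crc = crc_with_field_zeroed(blob)
    return HEADER.pack(MAGIC, crc) + payload


def is_valid(blob):
    """True if the signature matches and the stored CRC agrees with the content."""
    if len(blob) < HEADER.size:
        return False
    magic, stored = HEADER.unpack_from(blob)
    return magic == MAGIC and stored == crc_with_field_zeroed(blob)
```

The same masking function is used when writing and when checking, so the stored value never influences the result.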


Alternatively, if the files are stored on an NTFS drive and you don't need to transfer them to another computer you can use NTFS Alternate Data Streams to store the CRCs. Basically you open the file with the ADS name separated from the filename by a colon (like C:\file.txt:CRC). Windows handles the difference internally, so you can use plain TFileStream functions to manipulate them.

Alternate data streams are stored separately from the standard file stream, so opening or modifying just C:\file.txt won't affect it.

So, the code would look like this:

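// CrcOf is not defined in this answer; it stands for any routine
// that returns the CRC-32 of a stream's contents.
procedure UpdateCRC(const aFileName: string);
var
  FileStream, ADSStream: TStream;
  CRC: LongWord;
begin
  // Compute the CRC over the main data stream only.
  FileStream := TFileStream.Create(aFileName, fmOpenRead or fmShareDenyWrite);
  try
    CRC := CrcOf(FileStream);
  finally
    FileStream.Free;
  end;

  // Store the CRC in the alternate data stream named "CRC" (NTFS only).
  ADSStream := TFileStream.Create(aFileName + ':CRC', fmCreate);
  try
    ADSStream.WriteBuffer(CRC, SizeOf(CRC));
  finally
    ADSStream.Free;
  end;
end;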

If you need to find all of the alternate data streams attached to a file (there can be more than one), you can iterate over them using BackupRead. Internet Explorer uses ADSs to support the "This file has been downloaded from the Internet. Are you sure you want to open it?" prompt.




Answer 4:


I would recommend storing the checksum in another file, maybe a .ini file. Or, for a really weird idea, you could incorporate the checksum into the filename,
e.g. MyFile_checksum_digits_here.dat
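
A small sketch of the filename idea, in Python. The naming scheme (8 lowercase hex digits of the CRC-32 before the extension) and the function names are assumptions chosen for the example.

```python
import os
import re
import zlib

# Hypothetical naming scheme: <stem>_<8 hex digits of CRC-32><extension>
NAME_PATTERN = re.compile(r"_([0-9a-f]{8})(\.[^.]*)?$")


def name_with_crc(filename, content):
    """Return the filename with the CRC-32 of content inserted before the extension."""
    stem, ext = os.path.splitext(filename)
    return "%s_%08x%s" % (stem, zlib.crc32(content) & 0xFFFFFFFF, ext)


def matches_name(filename, content):
    """True if the checksum embedded in the filename matches the content."""
    match = NAME_PATTERN.search(os.path.basename(filename))
    if match is None:
        return False
    return int(match.group(1), 16) == (zlib.crc32(content) & 0xFFFFFFFF)
```

The file contents stay untouched, at the cost of the check being lost whenever someone renames the file.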



Source: https://stackoverflow.com/questions/8608219/save-a-crc-value-in-a-file-without-altering-the-actual-crc-checksum
