Preface:
I am doing a data import that has a verify phase and a commit phase. The verify phase takes data from various sources and runs various insert/update/validate operations on a database; the transaction is then rolled back, but a "verification hash/checksum" is generated. The commit phase does the same work, except that the operations are committed only if its "verification hash/checksum" matches the one from the verify phase. (The database will be running under the appropriate isolation levels.)
Restrictions:
- Input reading and operations are forward-read-once only
- Do not want to pre-create a stream (e.g. writing to MemoryStream not desirable) as there may be a lot of data. (It would work on our servers/load, but pretend memory is limited.)
- Do not want to "create my own". (I am aware of available code like CRC-32 by Damien which I could use/modify but would prefer something "standard".)
And what I (think I am) looking for:
A way to generate a Hash (e.g. SHA1 or MD5?) or a Checksum (e.g. CRC32 but hopefully more) based on input + operations. (The input/operations could themselves be hashed to values more fitting to the checksum generation but it would be nice just to be able to "write to stream".)
So, the question is:
How to generate a Running Hash (or Checksum) in C#?
Also, while there are CRC32 implementations that can be modified for a Running operation, what about running SHAx or MD5 hashes?
Am I missing some sort of handy Stream approach that could be used as an adapter?
(Critiques are welcome, but please also answer the above as applicable. Also, I would prefer not to deal with threads. ;-)
Hashes have a build and a finalization phase. You can shove arbitrary amounts of data in during the build phase. The data can be split up as you like. Finally, you finish the hash operation and get your hash.
You can use a writable CryptoStream to write your data. This is the easiest way.
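A minimal sketch of the CryptoStream approach, assuming SHA-256 and an input that arrives as a sequence of byte arrays. The class name CryptoStreamHashSketch, the method HashChunks, and the sample strings are hypothetical and only there for illustration. The hash algorithm is used as the stream's transform, and Stream.Null discards the bytes after they have been hashed, so nothing is buffered.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class CryptoStreamHashSketch
{
    // Hashes the chunks in order without storing them anywhere.
    public static byte[] HashChunks(IEnumerable<byte[]> chunks)
    {
        using (var sha = SHA256.Create())
        using (var hashStream = new CryptoStream(Stream.Null, sha, CryptoStreamMode.Write))
        {
            foreach (byte[] chunk in chunks)
                hashStream.Write(chunk, 0, chunk.Length);

            // Finalize the hash; sha.Hash is only valid after this call.
            hashStream.FlushFinalBlock();
            return sha.Hash;
        }
    }

    public static void Main()
    {
        // Hypothetical stand-ins for the import's inputs and operations.
        var parts = new[] { "insert row 1;", "update row 2;", "validate;" };
        byte[] running = HashChunks(parts.Select(p => Encoding.UTF8.GetBytes(p)));

        // Reference value: hash of the same bytes computed in one shot.
        byte[] oneShot;
        using (var sha = SHA256.Create())
            oneShot = sha.ComputeHash(Encoding.UTF8.GetBytes(string.Concat(parts)));

        Console.WriteLine(running.SequenceEqual(oneShot));
    }
}
```

The running hash and the one-shot hash cover the same byte sequence, so the comparison in Main should print True.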
You can call HashAlgorithm.TransformBlock as many times as you like and then call TransformFinalBlock once; after that, the Hash property holds the hash of all the blocks.
Chunk up your input (by reading x bytes at a time from a stream) and call TransformBlock with each chunk.
EDIT (from the msdn example):
public static void PrintHashMultiBlock(byte[] input, int size)
{
    SHA256Managed sha = new SHA256Managed();
    int offset = 0;
    // Feed the hash one full block of 'size' bytes at a time.
    while (input.Length - offset >= size)
        offset += sha.TransformBlock(input, offset, size, input, offset);
    // Hash whatever is left (possibly zero bytes) and finalize.
    sha.TransformFinalBlock(input, offset, input.Length - offset);
    // BytesToStr is a helper that is not shown in this excerpt.
    Console.WriteLine("MultiBlock {0:00}: {1}", size, BytesToStr(sha.Hash));
}
Sorry, I don't have an example for your exact case readily available. In your case, you would replace input with your own chunk, and size would be the number of bytes in that chunk. You will have to keep track of the offset yourself.
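The adaptation described above could look roughly like this: a sketch, assuming SHA-256 and a forward-only Stream as the source. The class name TransformBlockSketch, the method HashStream, and the buffer size are hypothetical. Because the same buffer is reused for every read, the offset passed to TransformBlock is always 0 here.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

public static class TransformBlockSketch
{
    // Reads the source once, front to back, hashing each chunk as it arrives.
    public static byte[] HashStream(Stream source, int bufferSize)
    {
        using (var sha = SHA256.Create())
        {
            var buffer = new byte[bufferSize];
            int read;
            while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
            {
                // No output copy is needed, so the output buffer is null.
                sha.TransformBlock(buffer, 0, read, null, 0);
            }

            // Finalize with an empty block; the result is then in sha.Hash.
            sha.TransformFinalBlock(buffer, 0, 0);
            return sha.Hash;
        }
    }

    public static void Main()
    {
        // Demo data only; a MemoryStream stands in for the real forward-only source.
        byte[] data = Enumerable.Range(0, 1000).Select(i => (byte)(i % 256)).ToArray();

        byte[] chunked;
        using (var source = new MemoryStream(data))
            chunked = HashStream(source, 64);

        byte[] oneShot;
        using (var sha = SHA256.Create())
            oneShot = sha.ComputeHash(data);

        Console.WriteLine(chunked.SequenceEqual(oneShot));
    }
}
```

The chunk size does not affect the result, so the chunked hash should equal the one-shot hash of the same bytes.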
You can generate an MD5 hash using the MD5CryptoServiceProvider's ComputeHash method. It takes a stream as input.
Create a memory or file stream, write your hash inputs to that, and then call the ComputeHash method when you are done.
var myStream = new MemoryStream();
// Blah blah, write to the stream...
myStream.Position = 0;
using (var csp = new MD5CryptoServiceProvider()) {
    var myHash = csp.ComputeHash(myStream);
}
EDIT: One possibility to avoid building up massive streams is to call this in a loop, once per chunk, and XOR the results together. Note that XOR is commutative, so the combined value does not depend on the order of the chunks, and two identical chunks cancel each other out:
// Assuming we had this somewhere:
byte[] myRunningHash = new byte[16];
// Later on, with myHash computed for each chunk as above:
for (var i = 0; i < 16; i++) // MD5 hashes are 16 bytes. Adjust for other algorithms.
    myRunningHash[i] ^= myHash[i];
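EDIT #2: Finally, building on @usr's answer below: HashCore and HashFinal do the work internally, but they are protected, so you call them through the public TransformBlock and TransformFinalBlock methods:
using (var csp = new MD5CryptoServiceProvider()) {
    // My example here uses a foreach loop, but an
    // event-driven stream-like approach is
    // probably more what you are doing here.
    foreach (byte[] someData in myDataThings)
        csp.TransformBlock(someData, 0, someData.Length, null, 0);
    // Finalize with an empty block, then read the result.
    csp.TransformFinalBlock(new byte[0], 0, 0);
    var myHash = csp.Hash;
}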
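Where the framework provides it (an assumption: .NET Core, or .NET Framework 4.7.1 and later), the IncrementalHash class exposes the same append-then-finalize pattern directly. A minimal sketch follows; the class name IncrementalHashSketch, the method HashParts, and the sample strings are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Security.Cryptography;
using System.Text;

public static class IncrementalHashSketch
{
    // Appends each part in order and returns the MD5 hash of all of them.
    public static byte[] HashParts(IEnumerable<byte[]> parts)
    {
        using (IncrementalHash hasher = IncrementalHash.CreateHash(HashAlgorithmName.MD5))
        {
            foreach (byte[] part in parts)
                hasher.AppendData(part);

            // Finalizes the hash and resets the instance for reuse.
            return hasher.GetHashAndReset();
        }
    }

    public static void Main()
    {
        // Hypothetical stand-ins for the data being imported.
        var parts = new[] { "first chunk", "second chunk" };
        byte[] running = HashParts(parts.Select(p => Encoding.UTF8.GetBytes(p)));

        byte[] oneShot;
        using (var md5 = MD5.Create())
            oneShot = md5.ComputeHash(Encoding.UTF8.GetBytes(string.Concat(parts)));

        Console.WriteLine(running.SequenceEqual(oneShot));
    }
}
```

Unlike the XOR combination, this result depends on the order in which the parts are appended, which is usually what a verification checksum needs.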
This is the canonical way to hash data that is already in memory as a string:
using System;
using System.Security.Cryptography;
using System.Text;
public string CreateHash(string sSourceData)
{
    // Create a byte array from the source data
    byte[] sourceBytes = Encoding.ASCII.GetBytes(sSourceData);
    // Calculate the 16-byte MD5 hash
    byte[] hashBytes;
    using (var md5 = new MD5CryptoServiceProvider())
        hashBytes = md5.ComputeHash(sourceBytes);
    // Return the hash as a hex string instead of discarding it
    return ByteArrayToHexString(hashBytes);
}
static string ByteArrayToHexString(byte[] arrInput)
{
    // Each byte becomes two hex characters.
    StringBuilder sOutput = new StringBuilder(arrInput.Length * 2);
    // Loop over every byte, including the last one.
    for (int i = 0; i < arrInput.Length; i++)
    {
        sOutput.Append(arrInput[i].ToString("X2"));
    }
    return sOutput.ToString();
}
Source: https://stackoverflow.com/questions/10985282/generate-running-hash-or-checksum-in-c