File-erasing utilities such as Eraser recommend overwriting data as many as 36 times.
As I understand it, all data is stored on a hard drive as 1s and 0s.
If an overwrite simply replaces that data with a new, random sequence of 1s and 0s, why isn't a single pass enough?
I've always wondered why the possibility that the file was previously stored in a different physical location on the disk isn't considered.
For example, if a defrag has just occurred, there could easily be a recoverable copy of the file somewhere else on the disk.
Here's a Gutmann erasing implementation I put together. It uses the cryptographic random number generator to produce a strong block of random data.
// Requires: using System; using System.IO; using System.Security.Cryptography;
public static void DeleteGutmann(string fileName)
{
    var fi = new FileInfo(fileName);
    if (!fi.Exists)
    {
        return;
    }
    if (fi.Length == 0)
    {
        // Nothing to overwrite; a zero buffer size would also make the
        // FileStream constructor below throw.
        fi.Delete();
        return;
    }

    // 35 passes in total: 4 random, 27 patterned, 4 random.
    const int GutmannPasses = 35;
    var gutmanns = new byte[GutmannPasses][];
    for (var i = 0; i < gutmanns.Length; i++)
    {
        // Passes 14, 19, 25, 26 and 27 repeat earlier patterns; they are
        // aliased to those buffers further down instead of being allocated here.
        if ((i == 14) || (i == 19) || (i == 25) || (i == 26) || (i == 27))
        {
            continue;
        }
        gutmanns[i] = new byte[fi.Length];
    }

    // Passes 0-3 and 31-34 are cryptographically strong random data.
    using (var rnd = new RNGCryptoServiceProvider())
    {
        for (var i = 0L; i < 4; i++)
        {
            rnd.GetBytes(gutmanns[i]);
            rnd.GetBytes(gutmanns[31 + i]);
        }
    }

    // Fill the fixed patterns. Pass 9 stays all zeroes (new arrays are zero-filled),
    // and the three-byte cycling patterns (passes 6-8 and 28-30) are rotated by
    // handling three consecutive byte positions per loop iteration.
    for (var i = 0L; i < fi.Length;)
    {
        gutmanns[4][i] = 0x55;
        gutmanns[5][i] = 0xAA;
        gutmanns[6][i] = 0x92;
        gutmanns[7][i] = 0x49;
        gutmanns[8][i] = 0x24;
        gutmanns[10][i] = 0x11;
        gutmanns[11][i] = 0x22;
        gutmanns[12][i] = 0x33;
        gutmanns[13][i] = 0x44;
        gutmanns[15][i] = 0x66;
        gutmanns[16][i] = 0x77;
        gutmanns[17][i] = 0x88;
        gutmanns[18][i] = 0x99;
        gutmanns[20][i] = 0xBB;
        gutmanns[21][i] = 0xCC;
        gutmanns[22][i] = 0xDD;
        gutmanns[23][i] = 0xEE;
        gutmanns[24][i] = 0xFF;
        gutmanns[28][i] = 0x6D;
        gutmanns[29][i] = 0xB6;
        gutmanns[30][i++] = 0xDB;
        if (i >= fi.Length)
        {
            continue;
        }

        gutmanns[4][i] = 0x55;
        gutmanns[5][i] = 0xAA;
        gutmanns[6][i] = 0x49;
        gutmanns[7][i] = 0x24;
        gutmanns[8][i] = 0x92;
        gutmanns[10][i] = 0x11;
        gutmanns[11][i] = 0x22;
        gutmanns[12][i] = 0x33;
        gutmanns[13][i] = 0x44;
        gutmanns[15][i] = 0x66;
        gutmanns[16][i] = 0x77;
        gutmanns[17][i] = 0x88;
        gutmanns[18][i] = 0x99;
        gutmanns[20][i] = 0xBB;
        gutmanns[21][i] = 0xCC;
        gutmanns[22][i] = 0xDD;
        gutmanns[23][i] = 0xEE;
        gutmanns[24][i] = 0xFF;
        gutmanns[28][i] = 0xB6;
        gutmanns[29][i] = 0xDB;
        gutmanns[30][i++] = 0x6D;
        if (i >= fi.Length)
        {
            continue;
        }

        gutmanns[4][i] = 0x55;
        gutmanns[5][i] = 0xAA;
        gutmanns[6][i] = 0x24;
        gutmanns[7][i] = 0x92;
        gutmanns[8][i] = 0x49;
        gutmanns[10][i] = 0x11;
        gutmanns[11][i] = 0x22;
        gutmanns[12][i] = 0x33;
        gutmanns[13][i] = 0x44;
        gutmanns[15][i] = 0x66;
        gutmanns[16][i] = 0x77;
        gutmanns[17][i] = 0x88;
        gutmanns[18][i] = 0x99;
        gutmanns[20][i] = 0xBB;
        gutmanns[21][i] = 0xCC;
        gutmanns[22][i] = 0xDD;
        gutmanns[23][i] = 0xEE;
        gutmanns[24][i] = 0xFF;
        gutmanns[28][i] = 0xDB;
        gutmanns[29][i] = 0x6D;
        gutmanns[30][i++] = 0xB6;
    }

    // The repeated passes share buffers with their earlier twins.
    gutmanns[14] = gutmanns[4];
    gutmanns[19] = gutmanns[5];
    gutmanns[25] = gutmanns[6];
    gutmanns[26] = gutmanns[7];
    gutmanns[27] = gutmanns[8];

    Stream s;
    try
    {
        s = new FileStream(
            fi.FullName,
            FileMode.Open,
            FileAccess.Write,
            FileShare.None,
            (int)fi.Length,
            FileOptions.DeleteOnClose | FileOptions.RandomAccess | FileOptions.WriteThrough);
    }
    catch (UnauthorizedAccessException)
    {
        return;
    }
    catch (IOException)
    {
        return;
    }

    using (s)
    {
        if (!s.CanSeek || !s.CanWrite)
        {
            return;
        }
        // Write each of the 35 passes over the whole file, flushing between passes.
        for (var i = 0L; i < gutmanns.Length; i++)
        {
            s.Seek(0, SeekOrigin.Begin);
            s.Write(gutmanns[i], 0, gutmanns[i].Length);
            s.Flush();
        }
    }
}
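The stream is opened with FileOptions.DeleteOnClose, so once the final pass has been flushed and the stream is disposed the file itself is removed; WriteThrough asks the OS to push the writes through the cache to the device. Calling it is simply (the path is only a placeholder):

DeleteGutmann(@"C:\temp\old-report.docx");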
A hard-drive bit that used to be a 0 and is then changed to a 1 has a slightly weaker magnetic field than one that used to be a 1 and was written as a 1 again. With sensitive equipment, the previous contents of each bit can be discerned with a reasonable degree of accuracy by measuring these slight variances in strength. The result won't be exactly correct and there will be errors, but a good portion of the previous contents can be retrieved.
By the time you've scribbled over the bits 35 times, it is effectively impossible to discern what used to be there.
Edit: A modern analysis (Wright, Kleiman and Sundhar's 2008 paper "Overwriting Hard Drive Data: The Great Wiping Controversy") shows that a single overwritten bit can be recovered with only 56% accuracy. Trying to recover an entire byte is only accurate about 0.97% of the time, since eight bits at roughly 56% each compound to 0.56^8 ≈ 1%. So I was just repeating an urban legend. Overwriting multiple times might have been necessary when working with floppy disks or some other medium, but hard disks do not need it.
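To put those numbers in perspective, here is a small back-of-the-envelope sketch (assuming 56% per-bit accuracy and independent bits; the computation is done in log space because the raw probabilities underflow a double):

using System;

class RecoveryOdds
{
    static void Main()
    {
        const double perBitAccuracy = 0.56;

        // Chance of reading back one whole byte correctly: 0.56^8 ≈ 0.97%.
        double log10Byte = 8 * Math.Log10(perBitAccuracy);
        Console.WriteLine($"One whole byte: {Math.Pow(10, log10Byte) * 100:F2}% chance");

        // Chance of recovering a 1 KiB file intact: about 10^-2063.
        double log10KiB = 8L * 1024 * Math.Log10(perBitAccuracy);
        Console.WriteLine($"A 1 KiB file intact: roughly 10^{log10KiB:F0}");
    }
}

At roughly 10^-2063 for a single kilobyte, recovering anything meaningful from one overwrite is not a realistic threat.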
Daniel Feenberg (an economist at the private National Bureau of Economic Research) claims that the chances of overwritten data being recovered from a modern hard drive amount to "urban legend":
Can Intelligence Agencies Read Overwritten Data?
So, theoretically, overwriting the file once with zeroes would be sufficient.
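A minimal sketch of that single zero pass, in the same C# style as the Gutmann code above (ZeroFill and the 64 KiB chunk size are my own choices here, not a standard API):

using System;
using System.IO;

public static class SinglePassWiper
{
    // Overwrite the file's contents with zeroes in one pass, then delete it.
    // Works in 64 KiB chunks so large files don't need a full-size buffer in memory.
    public static void ZeroFill(string fileName)
    {
        var fi = new FileInfo(fileName);
        if (!fi.Exists)
        {
            return;
        }
        if (fi.Length == 0)
        {
            fi.Delete();
            return;
        }

        var zeroes = new byte[64 * 1024]; // new byte arrays are already all zeroes

        using (var s = new FileStream(fi.FullName, FileMode.Open, FileAccess.Write,
                                      FileShare.None, zeroes.Length, FileOptions.WriteThrough))
        {
            long remaining = fi.Length;
            while (remaining > 0)
            {
                var chunk = (int)Math.Min(remaining, zeroes.Length);
                s.Write(zeroes, 0, chunk);
                remaining -= chunk;
            }
            s.Flush();
        }

        fi.Delete();
    }
}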
Just invert the bits, so that 1s are written over all the 0s and 0s over all the 1s, then zero everything out. That should get rid of any variation in the magnetic field, and it only takes two passes.
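That idea sketched out in C# (InvertThenZero is a hypothetical helper, not an existing tool; pass one writes back the bitwise complement of every block, pass two zeroes everything):

using System;
using System.IO;

public static class TwoPassWiper
{
    public static void InvertThenZero(string fileName)
    {
        var fi = new FileInfo(fileName);
        if (!fi.Exists || fi.Length == 0)
        {
            return;
        }

        var buffer = new byte[64 * 1024];

        using (var s = new FileStream(fi.FullName, FileMode.Open, FileAccess.ReadWrite,
                                      FileShare.None, buffer.Length, FileOptions.WriteThrough))
        {
            // Pass 1: read each block and write back its bitwise complement,
            // so every 0 becomes a 1 and every 1 becomes a 0.
            long pos = 0;
            while (pos < fi.Length)
            {
                s.Position = pos;
                var read = s.Read(buffer, 0, buffer.Length);
                if (read <= 0)
                {
                    break;
                }
                for (var i = 0; i < read; i++)
                {
                    buffer[i] = (byte)~buffer[i];
                }
                s.Position = pos;
                s.Write(buffer, 0, read);
                pos += read;
            }
            s.Flush();

            // Pass 2: overwrite everything with zeroes.
            Array.Clear(buffer, 0, buffer.Length);
            s.Position = 0;
            long remaining = fi.Length;
            while (remaining > 0)
            {
                var chunk = (int)Math.Min(remaining, buffer.Length);
                s.Write(buffer, 0, chunk);
                remaining -= chunk;
            }
            s.Flush();
        }
    }
}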
What we're looking at here is called "data remanence." In fact, most of the technologies that overwrite repeatedly are (harmlessly) doing more than is actually necessary. There have been attempts to recover data from disks whose contents had been overwritten, and, with the exception of a few lab cases, there are really no examples of such a technique succeeding.
When we talk about recovery methods, magnetic force microscopy is usually held up as the silver bullet for getting around a casual overwrite, but even it has no recorded successes, and it can in any case be defeated by writing a good pattern of binary data across the region of your magnetic media (as opposed to simple 0000000000s).
Lastly, the 36 (actually 35) overwrites you are referring to are recognized as dated and unnecessary today. The technique (known as the Gutmann method) was designed to accommodate the various encoding schemes (usually unknown to the user) used in technologies like RLL and MFM, which you're not likely to run into anyhow. Even the US government guidelines state that one overwrite is sufficient to delete data, though for administrative purposes they do not consider this acceptable for "sanitization". The suggested reason for this disparity is that "bad" sectors can be marked bad by the disk hardware and not properly overwritten when the time comes to do the overwrite, leaving open the possibility that visual inspection of the disk could recover those regions.
In the end, writing a 1010101010101010 or fairly random pattern over the data is enough to erase it to the point that known techniques cannot recover it.
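For example, a single pass of cryptographically random data (rather than a fixed 10101010 byte) might look like this sketch (RandomFill is a hypothetical helper; the 64 KiB chunk size is arbitrary):

using System;
using System.IO;
using System.Security.Cryptography;

public static class PatternWiper
{
    public static void RandomFill(string fileName)
    {
        var fi = new FileInfo(fileName);
        if (!fi.Exists || fi.Length == 0)
        {
            return;
        }

        var buffer = new byte[64 * 1024];

        using (var rng = RandomNumberGenerator.Create())
        using (var s = new FileStream(fi.FullName, FileMode.Open, FileAccess.Write,
                                      FileShare.None, buffer.Length, FileOptions.WriteThrough))
        {
            long remaining = fi.Length;
            while (remaining > 0)
            {
                rng.GetBytes(buffer); // a fresh random block for every chunk
                var chunk = (int)Math.Min(remaining, buffer.Length);
                s.Write(buffer, 0, chunk);
                remaining -= chunk;
            }
            s.Flush();
        }
    }
}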