A forum I frequent was down today, and upon restoration, I discovered that the last two days of forum posting had been rolled back completely.
Needless to say, I\'d
Note: The below answer is out of date since the Chrome disk cache format has changed.
Joachim Metz provides some documentation of the Chrome cache file format with references to further information.
For my use case, I only needed a list of cached URLs and their respective timestamps. I wrote a Python script to get these by parsing the data_* files under C:\Users\me\AppData\Local\Google\Chrome\User Data\Default\Cache\
:
import datetime
with open('data_1', 'rb') as datafile:
data = datafile.read()
for ptr in range(len(data)):
fourBytes = data[ptr : ptr + 4]
if fourBytes == b'http':
# Found the string 'http'. Hopefully this is a Cache Entry
endUrl = data.index(b'\x00', ptr)
urlBytes = data[ptr : endUrl]
try:
url = urlBytes.decode('utf-8')
except:
continue
# Extract the corresponding timestamp
try:
timeBytes = data[ptr - 72 : ptr - 64]
timeInt = int.from_bytes(timeBytes, byteorder='little')
secondsSince1601 = timeInt / 1000000
jan1601 = datetime.datetime(1601, 1, 1, 0, 0, 0)
timeStamp = jan1601 + datetime.timedelta(seconds=secondsSince1601)
except:
continue
print('{} {}'.format(str(timeStamp)[:19], url))