Question
I want to recover a site that had thousands of pages from Google Cache. Is there any way I can get it back quickly using Google Cache or some other web crawler/archiver?
Answer 1:
You can see what Google (still) knows about a website by using a site restrict:
http://www.google.com/search?q=site:[domain]
You might also check out the Internet Archive.
(In either case, you’d probably want to do some heavy-duty automating to fetch thousands of pages.)
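As a rough idea of what that automation could look like for the Internet Archive, here is a minimal Python sketch. It assumes the Wayback Machine's CDX listing endpoint and its `id_` raw-snapshot URL form; the function names (`build_cdx_query`, `parse_cdx_rows`, `snapshot_url`, `list_snapshots`) are made up for illustration, and `example.com` stands in for the lost domain. It only lists what was captured and builds the download URLs; fetching, throttling and retries are left out.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the Wayback Machine CDX server endpoint.
CDX_ENDPOINT = "http://web.archive.org/cdx/search/cdx"


def build_cdx_query(domain):
    """Build a CDX query URL that lists captures of every page under `domain`."""
    params = {
        "url": domain + "/*",         # prefix match on the whole site
        "output": "json",             # JSON array of rows; the first row is a header
        "fl": "timestamp,original",   # only the fields needed to build snapshot URLs
        "filter": "statuscode:200",   # skip redirects and error captures
        "collapse": "urlkey",         # one row per distinct URL (its first listed capture)
    }
    return CDX_ENDPOINT + "?" + urllib.parse.urlencode(params)


def parse_cdx_rows(rows):
    """Drop the header row and return a list of (timestamp, original_url) pairs."""
    return [(row[0], row[1]) for row in rows[1:]]


def snapshot_url(timestamp, original):
    """URL of the archived copy; 'id_' requests the page without the archive's toolbar."""
    return "https://web.archive.org/web/{}id_/{}".format(timestamp, original)


def list_snapshots(domain):
    """Query the CDX endpoint (network access required) and return the captures."""
    with urllib.request.urlopen(build_cdx_query(domain)) as resp:
        return parse_cdx_rows(json.load(resp))


# Example use (performs a network request, so it is left commented out):
# for ts, url in list_snapshots("example.com"):
#     print(snapshot_url(ts, url))
```

Each printed URL can then be downloaded with `urllib.request.urlopen` or a tool like wget. With thousands of pages it is worth pausing between requests so the archive does not start refusing them.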
Answer 2:
I created a free service to recover your website, which can retrieve most pages from the search engines' caches.
The output of the service is a zipped file with your HTML from the search engines' caches. It is still in beta, so it needs a lot of tweaks and bug fixes, but hopefully it can help you or other people who experience the same problem.
UPDATE: I didn't have time to continue developing the service, so it is closed.
Source: https://stackoverflow.com/questions/3359882/retrieving-an-entire-website-using-google-cache