Question
I am using the Google API for .NET (http://code.google.com/p/google-api-for-dotnet/), and no matter how many results I ask for, Google never returns more than 64.
Here is my code snippet:
GwebSearchClient client = new GwebSearchClient("xyz");
IList<IWebResult> results = client.Search(this.SearchText.Text, 100);
I expected to get 100 results, but I never get more than 64, irrespective of the search term used.
Any ideas?
Answer 1:
According to the Google AJAX Search API documentation (the .NET API sends the same HTTP requests to Google's servers), the maximum number of returned results is 64.
Note: The maximum number of results pages is based on the type of searcher. Local search supports 4 pages (or a maximum of 32 total results) and the other searchers (Blog, Book, Image, News, Patent, Video, and Web) support 8 pages (for a maximum total of 64 results).
From here, scroll two lines up. Or search the page for "maximum number".
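The limits in the quoted note imply 8 results per page (64 / 8 for web search, 32 / 4 for local search). A minimal sketch of that arithmetic follows; the class and method names are hypothetical and are not part of the Google API for .NET.

```csharp
using System;

public static class SearchLimits
{
    // Derived from the quoted note: 64 results / 8 pages = 32 results / 4 pages = 8.
    public const int ResultsPerPage = 8;

    // Largest result count a searcher can return for a given page limit.
    public static int MaxResults(int maxPages)
    {
        return maxPages * ResultsPerPage;
    }

    // Number of results actually obtainable when `requested` are asked for.
    public static int Obtainable(int requested, int maxPages)
    {
        return Math.Min(requested, MaxResults(maxPages));
    }
}

public static class Program
{
    public static void Main()
    {
        // Asking a web searcher (8 pages) for 100 results.
        Console.WriteLine(SearchLimits.Obtainable(100, 8));
        // Asking a local searcher (4 pages) for 100 results.
        Console.WriteLine(SearchLimits.Obtainable(100, 4));
    }
}
```

Under these assumptions, a request for 100 web results is capped at 64, which matches the behavior described in the question.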
Answer 2:
There is always the option of parsing the HTML:
I needed approximately 200,000,000 (or at least 24M) results, and since the API wasn't cutting it, I decided to download the HTML results pages and parse them manually using regular expressions. Using hash tables, I was able to eliminate any duplicates.
My regular expression:
(matches only URLs on the given domain whose subdomain is 3-20 alphanumeric characters or hyphens and does not start with www)
@"((?!www)([A-Za-z0-9-]{3,20})(\.example\.com))"
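As a rough sketch of how this pattern and the duplicate elimination might fit together: the code below applies the regular expression to a string of HTML and collects the matches in a `HashSet<string>`, which stands in here for the hash tables mentioned above. The class name, the method name, and the sample HTML are hypothetical, and lowercasing the host names is an added assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class SubdomainExtractor
{
    // The pattern from the answer, with example.com as the placeholder domain.
    private static readonly Regex HostPattern = new Regex(
        @"((?!www)([A-Za-z0-9-]{3,20})(\.example\.com))",
        RegexOptions.Compiled);

    // Adds every matching host name found in `html` to `seen`.
    // HashSet<string>.Add ignores values already present, so duplicates drop out.
    public static void Collect(string html, HashSet<string> seen)
    {
        foreach (Match m in HostPattern.Matches(html))
        {
            // Assumption: host names are compared case-insensitively.
            seen.Add(m.Value.ToLowerInvariant());
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var seen = new HashSet<string>();
        // Hypothetical fragment standing in for a downloaded results page.
        string html = "<a href=\"http://mail.example.com/\">Mail</a> "
                    + "<a href=\"http://www.example.com/\">Home</a> "
                    + "<a href=\"http://blog.example.com/a\">Blog</a> "
                    + "<a href=\"http://mail.example.com/inbox\">Inbox</a>";
        SubdomainExtractor.Collect(html, seen);
        foreach (string host in seen)
        {
            Console.WriteLine(host);
        }
    }
}
```

Passing the same set to `Collect` for every downloaded page keeps the result unique across pages.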
HTML URL Used:
[C# Source]
// count = results requested per page, start = offset of the first result
string url = String.Format(
    "http://www.google.com/search?q=site:{0}&num={1}" +
    "&hl=en&tbo=d&as_qdr=all&start={2}&sa=N&biw=1280&bih=709",
    "example.com", count, start);
This has been tested in my own applications and yields rather nice results!
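To walk through successive result pages with this URL format, one option is to advance `start` by `count` on each request. The sketch below only builds the URLs and does not download anything; `SearchUrls`, `Build`, and `Pages` are hypothetical names, and treating `start` as a result offset that grows by `count` per page is an assumption about how the parameters are used in the answer.

```csharp
using System;
using System.Collections.Generic;

public static class SearchUrls
{
    // Builds the URL from the answer for one page of results.
    public static string Build(string domain, int count, int start)
    {
        return String.Format(
            "http://www.google.com/search?q=site:{0}&num={1}" +
            "&hl=en&tbo=d&as_qdr=all&start={2}&sa=N&biw=1280&bih=709",
            Uri.EscapeDataString(domain), count, start);
    }

    // Yields one URL per page, advancing `start` by `count` each time.
    public static IEnumerable<string> Pages(string domain, int count, int pages)
    {
        for (int page = 0; page < pages; page++)
        {
            yield return Build(domain, count, page * count);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        // Print the URLs for the first three pages of 100 results each.
        foreach (string url in SearchUrls.Pages("example.com", 100, 3))
        {
            Console.WriteLine(url);
        }
    }
}
```

Each URL can then be fetched and its HTML passed through the regular expression shown earlier in this answer.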
Source: https://stackoverflow.com/questions/3521121/google-not-returning-more-than-64-results