Question
I would like to run a query on Google Images and fetch the resulting images using HtmlAgilityPack in C#. For this I used the following XPath expression to reach the image:
//*[@id="rg_s"]/div[1]/a/img
But it fails to fetch the image that way. What could be the correct way of doing this?
Answer 1:
You can also try this. It collects the image links by taking every anchor whose inner HTML contains an img tag and reading its href:
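// htmlDocument is assumed to be an already loaded HtmlAgilityPack.HtmlDocument instance.
// GetAttributeValue avoids a NullReferenceException for anchors that have no href.
var links = htmlDocument.DocumentNode.SelectNodes("//a").Where(a => a.InnerHtml.Contains("<img")).Select(a => a.GetAttributeValue("href", "")).ToList();
foreach (var link in links)
{
// save the link or process it here
}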
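A slightly more defensive variant of the same idea, as a rough sketch only. It assumes the HtmlAgilityPack NuGet package is referenced; the class name ImageLinkExtractor, the method name ExtractImageLinks, and the sample markup are made up for illustration and are not real Google output. The XPath //a[img][@href] matches anchors that have an img child and an href attribute, and the null check is there because SelectNodes returns null rather than an empty collection when nothing matches.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ImageLinkExtractor
{
    // Returns the href of every <a> element that has an <img> child.
    public static List<string> ExtractImageLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectNodes returns null (not an empty collection) when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[img][@href]");
        if (anchors == null)
        {
            return new List<string>();
        }

        return anchors
            .Select(a => a.GetAttributeValue("href", ""))
            .Where(href => href.Length > 0)
            .ToList();
    }

    public static void Main()
    {
        // Hypothetical sample markup, not real Google output.
        const string sample =
            "<div><a href='/imgres?id=1'><img src='a.jpg'></a></div>" +
            "<div><a href='/imgres?id=2'><img src='b.jpg'></a></div>" +
            "<a href='/about'>no image here</a>";

        foreach (var link in ExtractImageLinks(sample))
        {
            Console.WriteLine(link);
        }
    }
}
```

Keep in mind that HtmlAgilityPack only sees the raw HTML that was downloaded. An XPath copied from the browser describes the page after scripts have run, so it may not match anything in the downloaded document.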
Answer 2:
Google keeps the found images in div tags with class rg_di. Here is a query to get all links to images:
var links = hdoc.DocumentNode.SelectNodes(@"//div[@class='rg_di']/a")
.Select(a => a.GetAttributeValue("href", ""));
Answer 3:
Searching Google programmatically outside of their APIs is against the TOS. Consider the Google Custom Search API or the Bing Search API, both of which have established JSON and SOAP interfaces.
Both are free for a couple thousand queries per month and comply with the service's TOS.
Edit: An example of using the Bing API with C#:
const string bingKey = "[your key here]";
var bing = new BingSearchContainer(new Uri("https://api.datamarket.azure.com/Bing/Search/"))
{
Credentials = new NetworkCredential(bingKey, bingKey)
};
var query = bing.Web("Jon Gallant blog", null, null, null, null, null, null, null);
var results = query.Execute();
foreach(var result in results)
{
Console.WriteLine(result.Url);
}
Console.ReadKey();
Google Custom Search API:
string apiKey = "Your api key";
string cx = "Your custom search engine id";
string query = "Your query";
var svc = new Google.Apis.Customsearch.v1.CustomsearchService(new BaseClientService.Initializer { ApiKey = apiKey });
var listRequest = svc.Cse.List(query);
listRequest.Cx = cx;
var search = listRequest.Fetch();
foreach (var result in search.Items)
{
Response.Output.WriteLine("Title: {0}", result.Title);
Response.Output.WriteLine("Link: {0}", result.Link);
}
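Since the question is about images: the Custom Search JSON endpoint can also be called directly over HTTP, and it accepts searchType=image to restrict results to images. Below is a minimal sketch of building that request URL and reading the link field of each returned item. It assumes System.Text.Json is available; the class and method names (CustomSearchSketch, BuildImageSearchUrl, ParseImageLinks) and the sample JSON are hypothetical, and the actual HTTP call is left out.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class CustomSearchSketch
{
    // Builds the request URL for an image search against the Custom Search JSON endpoint.
    public static string BuildImageSearchUrl(string apiKey, string cx, string query)
    {
        return "https://www.googleapis.com/customsearch/v1"
            + "?key=" + Uri.EscapeDataString(apiKey)
            + "&cx=" + Uri.EscapeDataString(cx)
            + "&q=" + Uri.EscapeDataString(query)
            + "&searchType=image";
    }

    // Collects the "link" field of each entry in the response's "items" array.
    public static List<string> ParseImageLinks(string json)
    {
        var links = new List<string>();
        using (JsonDocument doc = JsonDocument.Parse(json))
        {
            // Treat a missing "items" array as "no results" (assumption).
            if (!doc.RootElement.TryGetProperty("items", out JsonElement items))
            {
                return links;
            }

            foreach (JsonElement item in items.EnumerateArray())
            {
                if (item.TryGetProperty("link", out JsonElement link))
                {
                    links.Add(link.GetString());
                }
            }
        }
        return links;
    }

    public static void Main()
    {
        Console.WriteLine(BuildImageSearchUrl("Your api key", "Your custom search engine id", "red panda"));

        // Hypothetical response body, trimmed to the one field used here.
        const string sampleJson = "{\"items\":[{\"link\":\"https://example.com/a.jpg\"}]}";
        foreach (var link in ParseImageLinks(sampleJson))
        {
            Console.WriteLine(link);
        }
    }
}
```

The URL from BuildImageSearchUrl can be fetched with HttpClient and the response body passed to ParseImageLinks.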
Source: https://stackoverflow.com/questions/20210315/fetching-google-images-using-htmlagilitypack