Fetching Google Images using HtmlAgilityPack

Submitted by 让人想犯罪 on 2019-12-08 13:32:57

Question


I would like to run a query on Google Images and fetch the resulting images using HtmlAgilityPack in C#. To do this, I used the following XPath expression to select the image:

//*[@id="rg_s"]/div[1]/a/img

However, this fails to fetch the image. What is the correct way to do this?


Answer 1:


You can also try this. It collects the link of every anchor that wraps an image:

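// doc is an HtmlAgilityPack.HtmlDocument already loaded with the page HTML
var anchors = doc.DocumentNode.SelectNodes("//a"); // null when nothing matches
var links = (anchors ?? Enumerable.Empty<HtmlNode>())
    .Where(a => a.InnerHtml.Contains("<img"))
    .Select(a => a.GetAttributeValue("href", ""))
    .ToList();
foreach (var link in links)
{
    // you can save the link or do your processing here
}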
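
The same idea can be pushed into the XPath expression itself, so that only anchors containing an image are selected in the first place. The following is a minimal sketch, not a definitive implementation: the class name ImageLinkExtractor and the method ExtractImageLinks are hypothetical names chosen for illustration, and it assumes the page HTML has already been downloaded into a string.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ImageLinkExtractor
{
    // Hypothetical helper (not part of HtmlAgilityPack): returns the href
    // of every <a> element that has an <img> somewhere inside it.
    public static List<string> ExtractImageLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // The predicate [.//img] keeps only anchors with an <img> descendant.
        // SelectNodes returns null, not an empty collection, when nothing matches.
        var anchors = doc.DocumentNode.SelectNodes("//a[.//img]");
        if (anchors == null)
        {
            return new List<string>();
        }

        return anchors
            .Select(a => a.GetAttributeValue("href", ""))
            .Where(href => href.Length > 0) // skip anchors without an href
            .ToList();
    }
}
```

Calling ImageLinkExtractor.ExtractImageLinks(html) on markup with no matching anchors gives back an empty list instead of throwing a NullReferenceException, which is the usual failure when the result of SelectNodes is used without a null check.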



Answer 2:


Google places each image result in a div tag with the class rg_di. Here is a query to get the links of all image results:

var links = hdoc.DocumentNode.SelectNodes(@"//div[@class='rg_di']/a")
                .Select(a => a.GetAttributeValue("href", ""));
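
One caveat with @class='rg_di' is that it only matches when the class attribute is exactly that string, so a div carrying several classes is missed. A common XPath 1.0 workaround is to match the class as a whitespace-separated token. The sketch below illustrates this under a few assumptions: ResultLinkExtractor and ExtractResultLinks are hypothetical names, the class name passed in contains no quote characters, and the markup Google serves may well differ from what this answer describes.

```csharp
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ResultLinkExtractor
{
    // Hypothetical helper: returns the href of every <a> that is a direct
    // child of a <div> whose class list contains cssClass as a whole token.
    // Assumes cssClass contains no quote characters.
    public static List<string> ExtractResultLinks(string html, string cssClass)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Padding the normalized class list with spaces lets contains()
        // match " rg_di " without also matching names such as "rg_dix".
        string xpath =
            "//div[contains(concat(' ', normalize-space(@class), ' '), ' "
            + cssClass + " ')]/a";

        // SelectNodes returns null when no node matches.
        var anchors = doc.DocumentNode.SelectNodes(xpath);
        if (anchors == null)
        {
            return new List<string>();
        }

        return anchors.Select(a => a.GetAttributeValue("href", "")).ToList();
    }
}
```

For example, ResultLinkExtractor.ExtractResultLinks(html, "rg_di") would pick up a div written as class="rg_di rg_el", which the exact-match query skips.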



Answer 3:


Searching Google programmatically outside of its APIs is against the TOS. Consider the Google Custom Search API or the Bing Search API, both of which have established JSON and SOAP interfaces.

Both are free for a couple of thousand queries per month, and using them complies with each service's TOS.

Edit: An example of using the Bing API with C#:

const string bingKey = "[your key here]";
var bing = new BingSearchContainer(new Uri("https://api.datamarket.azure.com/Bing/Search/")) 
{ 
    Credentials = new NetworkCredential(bingKey, bingKey) 
};

var query = bing.Web("Jon Gallant blog", null, null, null, null, null, null, null);
var results = query.Execute();

foreach(var result in results)
{
    Console.WriteLine(result.Url);
}
Console.ReadKey();

Google Custom Search API:

string apiKey = "Your api key";
string cx = "Your custom search engine id";
string query = "Your query";

var svc = new Google.Apis.Customsearch.v1.CustomsearchService(new BaseClientService.Initializer { ApiKey = apiKey });
var listRequest = svc.Cse.List(query);

listRequest.Cx = cx;
var search = listRequest.Execute(); // older releases of the client library named this method Fetch()

foreach (var result in search.Items)
{
    Response.Output.WriteLine("Title: {0}", result.Title);
    Response.Output.WriteLine("Link: {0}", result.Link);
}


Source: https://stackoverflow.com/questions/20210315/fetching-google-images-using-htmlagilitypack
