HTMLAgilityPack SelectNodes to select all <img> elements

北慕城南 提交于 2019-11-28 07:10:58

问题


I am making a project in C# that's basically an image screen scraper for an image-search related game. I'm trying to use HTMLAgilityPack to select all the image elements and put them in an HTMLNodeCollection, like this:

//set up for checking autos

HtmlNodeCollection imgs = new HtmlNodeCollection(doc.DocumentNode.ParentNode);
imgs = doc.DocumentNode.SelectNodes("//img");

foreach (HtmlNode img in imgs)
{
    HtmlAttribute src = img.Attributes["@src"];
    urls.Add(src.Value);
}

Note that urls is a public List collection:

public List<string> urls = new List<string>();

My foreach loop is throwing an exception:

Object reference not set to an instance of an object.

Checking the autos, sure enough, imgs is null. Is there any better way I can track down the source of this problem? I have no idea if it's my Xpath or what.

The most frustrating part is that I had already gotten it to work, but messed up my file versions and lost my work. Derp.


回答1:


You might have a typo in the following line:

HtmlAttribute src = img.Attributes["@src"];

I got this to work for me (notice the @ position):

HtmlAttribute src = img.Attributes[@"src"];



回答2:


This works for me. I think your document isn't loaded correctly, hence the xpath returns no matches.

HtmlDocument htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml("<html><head></head><body><div><img /><div><img /><img/></div></div><img/></body></html>");

var nodes = htmlDocument.DocumentNode.SelectNodes("//img");
// 4 nodes found
foreach (var node in nodes)
{
    // do stuff
}


来源:https://stackoverflow.com/questions/7883449/htmlagilitypack-selectnodes-to-select-all-img-elements

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!