Question
I am trying to read the company name from this page.
The query below should return a node whose InnerText is "Cascade Corporation", but I get null instead.
HtmlNode htest = document.DocumentNode.SelectSingleNode("//*[@id='appbar']/div/div[2]/div[1]/span");
What am I missing?
P.S. It must work with Chrome.
Answer 1:
I tried to reproduce your issue on my machine and captured the request and response data using Fiddler. I was surprised to see that the HTML rendered in the browser is different from the HTML my code receives.
In Fiddler, the difference I noticed between the two requests is the user agent value. I then came up with the code below, and it works for me. Please try it and let me know whether it helps.
using HtmlAgilityPack;

string url = "http://www.google.com/finance?q=NASDAQ:TXN&fstype=ii";
HtmlWeb web = new HtmlWeb();
// Send a browser-like User-Agent; without it the server returns different markup.
web.UserAgent = "Mozilla/5.0 (Windows NT 6.1; rv:12.0) Gecko/20100101 Firefox/12.0"; // Firefox 12
HtmlDocument doc = web.Load(url);
var node = doc.DocumentNode.SelectSingleNode("//*[@id='appbar']/div/div[2]/div[1]/span");
// Alternative that selects by class name instead of by position:
//var node = doc.DocumentNode.SelectSingleNode("//div[@class='appbar-snippet-primary']/span");
When I comment out the user agent line, I can reproduce your issue. Hope it helps.
Answer 2:
On the page you are linking to, there is no element with the id appbar.
There is only a div with a class called appbar-hide,
and that is the only place "appbar" appears in the source.
When facing a problem like this, try a step-by-step approach. First select only the first node in your XPath, i.e. start with HtmlNode htest = document.DocumentNode.SelectSingleNode("//*[@id='appbar']");
If that returns null (which it will in this case), you've found where the error is. Correct the error, then try the full XPath again if you feel confident that the rest is OK. If you get null again, go back to the second element, i.e. //*[@id='appbar']/div
and progress like this until you get the element you want.
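The step-by-step approach described above can be automated with a small helper. The following is only a minimal sketch, assuming HtmlAgilityPack is referenced in the project. The class XPathDebug, the method FirstFailingStep, and the inline markup are hypothetical names and data made up for illustration; they are not part of the library or of the real page.

```csharp
using System;
using HtmlAgilityPack;

static class XPathDebug
{
    // Tries each XPath prefix in order and returns the first one that
    // matches nothing, or null when every prefix matches a node.
    public static string FirstFailingStep(HtmlDocument doc, string[] steps)
    {
        foreach (string xpath in steps)
        {
            if (doc.DocumentNode.SelectSingleNode(xpath) == null)
                return xpath;
        }
        return null;
    }

    static void Main()
    {
        // Hypothetical markup standing in for the downloaded page.
        var doc = new HtmlDocument();
        doc.LoadHtml("<div id='appbar'><div><span>Cascade Corporation</span></div></div>");

        // The full query from the question, built up one element at a time.
        string[] steps =
        {
            "//*[@id='appbar']",
            "//*[@id='appbar']/div",
            "//*[@id='appbar']/div/div[2]",
            "//*[@id='appbar']/div/div[2]/div[1]",
            "//*[@id='appbar']/div/div[2]/div[1]/span"
        };

        string failing = FirstFailingStep(doc, steps);
        Console.WriteLine(failing ?? "every step matched");
    }
}
```

The first prefix that comes back null marks the point where the XPath and the actual markup disagree, so that is the part of the expression to correct before retrying the full query.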
Source: https://stackoverflow.com/questions/10993199/returns-a-null-from-html-node