How to scrape an XML file using HtmlAgilityPack

Question

I need to scrape an XML file from http://feeds.feedburner.com/Torrentfreak for its links and descriptions.

I used this code:

    var webGet = new HtmlWeb();
    var document = webGet.Load("http://feeds.feedburner.com/TechCrunch");
    var TechCrunch = from info in document.DocumentNode.SelectNodes("//channel")
                     from link in info.SelectNodes("//guid[@isPermaLink='false']")
                     from content in info.SelectNodes("//description")
                     select new
                     {
                         LinkURL = info.InnerText,
                         Content = content.InnerText,
                     };
    lvLinks.DataSource = TechCrunch;
    lvLinks.DataBind();

I bind this to a ListView control on an ASP.NET page using:

    <%# Eval("LinkURL") %>  -  <%# Eval("Text") %>

But it shows this error:

    Value cannot be null. Parameter name: source

What is the problem? And is it possible to scrape (fetch) XML node data using HtmlAgilityPack? Please advise. Thanks.

Answer 1:

Try using an RSS library instead of HtmlAgilityPack.

Here are some links that might help you:

http://www.rssdotnet.com/
http://www.yetanotherchris.me/home/2010/2/8/simplified-c-atom-and-rss-feed-parser.html
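
As a rough sketch of the same idea without a third-party dependency, an RSS feed can also be read as plain XML. Note that this swaps in the framework's built-in System.Xml.Linq rather than either library linked above. The sample feed string, the `FeedItem` class, and the `ParseItems` helper are illustrative assumptions and do not come from the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

// Hypothetical holder for one feed entry; the property names mirror the
// anonymous type used in the question.
public class FeedItem
{
    public string LinkURL { get; set; }
    public string Content { get; set; }
}

public static class RssSketch
{
    // Made-up sample feed so the sketch runs without network access.
    public const string SampleRss = @"<?xml version=""1.0""?>
<rss version=""2.0"">
  <channel>
    <title>Sample feed</title>
    <description>Channel-level description</description>
    <item>
      <guid isPermaLink=""false"">http://example.com/?p=1</guid>
      <description>First post</description>
    </item>
    <item>
      <guid isPermaLink=""false"">http://example.com/?p=2</guid>
      <description>Second post</description>
    </item>
  </channel>
</rss>";

    // Visits each <item> once and pairs its own <guid> with its own <description>.
    public static List<FeedItem> ParseItems(string xml)
    {
        XDocument doc = XDocument.Parse(xml);
        return doc.Descendants("item")
                  .Select(item => new FeedItem
                  {
                      // Casting a missing (null) XElement to string yields null,
                      // so fall back to an empty string.
                      LinkURL = (string)item.Element("guid") ?? string.Empty,
                      Content = (string)item.Element("description") ?? string.Empty
                  })
                  .ToList();
    }

    public static void Main()
    {
        foreach (FeedItem item in ParseItems(SampleRss))
        {
            Console.WriteLine(item.LinkURL + " - " + item.Content);
        }
    }
}
```

For a live feed, `XDocument.Load` accepts a URI string in place of `XDocument.Parse`. The resulting list could then be assigned to `lvLinks.DataSource`, with the markup evaluating `LinkURL` and `Content`.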

Answer 2:

The error says that a value is null, so there are two possibilities. The first is in the query:

    select new
    {
        LinkURL = info.InnerText ?? string.Empty,
        Content = content.InnerText ?? string.Empty,
    };

The second is in the .aspx markup. I think the minus sign should be inside a string, like this:

    <%# Eval("LinkURL") ?? string.Empty %>+"-"+<%# Eval("Text") ?? string.Empty %>
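
One concrete way a null can reach the query: HtmlAgilityPack's `SelectNodes` returns null rather than an empty collection when nothing matches, and a LINQ query over a null sequence throws `ArgumentNullException` with the parameter name `source`. The sketch below reproduces this with a plain null sequence so that it runs without HtmlAgilityPack, and shows one possible guard. `OrEmpty` and `NoMatches` are hypothetical names introduced for illustration, not library methods.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class NullSourceSketch
{
    // Hypothetical helper: treat a null sequence as an empty one.
    public static IEnumerable<T> OrEmpty<T>(this IEnumerable<T> source)
    {
        return source ?? Enumerable.Empty<T>();
    }

    // Stand-in for a SelectNodes call that matched nothing.
    public static IEnumerable<string> NoMatches()
    {
        return null;
    }

    // Runs the unguarded query and reports which parameter was null.
    public static string UnguardedParamName()
    {
        try
        {
            var query = from node in NoMatches()
                        select node.ToUpperInvariant();
            return query.Count().ToString();
        }
        catch (ArgumentNullException ex)
        {
            return ex.ParamName;
        }
    }

    // Same query with the guard: yields an empty result instead of throwing.
    public static int GuardedCount()
    {
        var query = from node in NoMatches().OrEmpty()
                    select node.ToUpperInvariant();
        return query.Count();
    }

    public static void Main()
    {
        Console.WriteLine("Unguarded: null parameter = " + UnguardedParamName());
        Console.WriteLine("Guarded: " + GuardedCount() + " results");
    }
}
```

Applied to the question's query, the same guard would wrap each `SelectNodes` call, so that a failed XPath match produces an empty list rather than an exception.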

Source: https://stackoverflow.com/questions/9108644/how-to-scrape-xml-file-using-htmlagilitypack

Tags

ASP.NET

screen-scraping

html-agility-pack
