HTML find and replace href tags [duplicate]

Question

Possible Duplicate:
What is the best way to parse html in C#?

I am parsing an HTML file. I need to find all the anchor tags with an href attribute in the HTML and replace them with a text-friendly version.

Here is an example.

Original text: <a href="http://foo.bar">click here</a>
Replacement value: click here <http://foo.bar>

How do I achieve this?

Answer 1:

You could use the Html Agility Pack library, with code like this:

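        // Requires the HtmlAgilityPack package and: using HtmlAgilityPack;
        HtmlDocument doc = new HtmlDocument();
        doc.Load(myHtmlFile); // load your file

        // Select all A elements, at any depth, that declare an HREF attribute.
        // SelectNodes returns null (not an empty collection) when nothing matches.
        HtmlNodeCollection links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links != null)
        {
            foreach (HtmlNode node in links)
            {
                // Build the text-friendly form: link text followed by <url>.
                string text = node.InnerText + " <" + node.GetAttributeValue("href", null) + ">";
                node.ParentNode.ReplaceChild(doc.CreateTextNode(text), node);
            }
        }

        doc.Save(Console.Out); // output the new doc.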
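
If the HTML is already in memory as a string rather than in a file, the same approach can be wrapped in a small helper. This is only a sketch: the class name `LinkFlattener` and the method name `FlattenLinks` are hypothetical and not part of Html Agility Pack, and it assumes the HtmlAgilityPack package is referenced by the project.

```csharp
using System;
using HtmlAgilityPack;

public static class LinkFlattener
{
    // Hypothetical helper: replaces each <a href="url">text</a> with a text node "text <url>".
    public static string FlattenLinks(string html)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html); // parse from a string instead of a file

        // SelectNodes returns null when no node matches the XPath expression.
        var links = doc.DocumentNode.SelectNodes("//a[@href]");
        if (links == null)
        {
            return html; // nothing to replace
        }

        foreach (HtmlNode link in links)
        {
            string replacement = link.InnerText + " <" + link.GetAttributeValue("href", "") + ">";
            link.ParentNode.ReplaceChild(doc.CreateTextNode(replacement), link);
        }

        return doc.DocumentNode.OuterHtml;
    }

    public static void Main()
    {
        // Sample input taken from the question.
        Console.WriteLine(FlattenLinks("<p><a href=\"http://foo.bar\">click here</a></p>"));
    }
}
```

Returning the input unchanged when there are no links keeps the helper safe to call on fragments that contain no anchors at all.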

Source: https://stackoverflow.com/questions/13126238/html-find-and-replace-href-tags

Tags

html

html-parsing
