C# WPF Webbrowser msHTML - Explore DOM - Find Elements

Submitted by 会有一股神秘感 on 2019-12-11 02:31:24

Question


I'm working on a personal project in C# using WPF and the WPF WebBrowser control. I need to explore HTML DOM elements the way we do in JavaScript, PHP, etc.

In my MainWindow I have this variable :

private mshtml.HTMLDocument mainDocument = new mshtml.HTMLDocument();

In my WebBrowser LoadCompleted handler I have this:

mainDocument = (mshtml.HTMLDocument) mainBrowser.Document;

Ok, so this is nice, it's working.

Now if I do this :

mshtml.IHTMLElement elem = mainDocument.getElementById("MY_ID");

This also works nicely; I can read elem.innerHTML and similar properties.

But my problem is that only HTMLDocument has methods to find elements by ID, by tag name, etc.

I don't know how to find elements inside an IHTMLElement. I tried things like casting IHTMLElement to IHTMLElement2, but nothing worked.

Please share any ideas. A lot of people talk about hosting the WinForms WebBrowser, but I think there must be a way to do this with mshtml alone.

Thanks a lot. If you need more information, please feel free to ask.

PS: I'm French, so I'm sorry about my English skills.


Answer 1:


If you want to parse an HTML document in WinForms or WPF, you can use Html Agility Pack, an excellent parser. See http://html-agility-pack.net

var url = "http://html-agility-pack.net/";
var web = new HtmlWeb();
var doc = web.Load(url);

After loading it in doc, you can get any attribute, tag, etc.

// First() is a LINQ extension method, so this needs "using System.Linq;"
var value = doc.DocumentNode
    .SelectNodes("//td/input")
    .First()
    .Attributes["value"].Value;

It's easy to use; browse the documentation a bit and you can make full use of it.
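The snippet above throws if the XPath matches nothing, because SelectNodes returns null in that case, and indexing Attributes with a missing name also yields null. Below is a minimal, more defensive sketch of the same lookup. It assumes the HtmlAgilityPack NuGet package is referenced; the markup and the class name XPathSketch are made up for illustration and are not from the original answer.

```csharp
using System;
using HtmlAgilityPack;

class XPathSketch
{
    static void Main()
    {
        // Hypothetical markup standing in for a downloaded page.
        const string html =
            "<table><tr><td><input name='q' value='hello'></td></tr></table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // SelectSingleNode returns null when no node matches the XPath.
        HtmlNode input = doc.DocumentNode.SelectSingleNode("//td/input");
        if (input == null)
        {
            Console.WriteLine("no match");
            return;
        }

        // GetAttributeValue falls back to the given default
        // when the attribute is absent.
        string value = input.GetAttributeValue("value", string.Empty);
        Console.WriteLine(value);
    }
}
```

SelectSingleNode is used here because only the first match is needed, which also removes the dependency on System.Linq.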

You can even load Html Agility Pack from a WinForms WebBrowser control, like below:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
doc.Load(webBrowser1.DocumentStream);

Or you can do it like this:

HtmlAgilityPack.HtmlDocument doc = new HtmlAgilityPack.HtmlDocument();
// Load() has no overload taking the browser's Document object,
// so pass the page markup as a string instead.
doc.LoadHtml(webBrowser1.DocumentText);

Thanks




Answer 2:


Thanks a lot @Sujit for your help. I don't have enough reputation to mark your answer as helpful, but I hope others will.

To get it working with the WPF WebBrowser (mainHTMLDoc is an HtmlAgilityPack.HtmlDocument), I did this:

mainHTMLDoc.LoadHtml((mainBrowser.Document as mshtml.HTMLDocument).documentElement.innerHTML);

To manipulate the result, you should add this:

using System.Linq;

After that you can do things like this:

var table = mainHTMLDoc.GetElementbyId("MyID");
var rows = table.Element("tbody").Elements("tr");
for (int i = 0; i < rows.Count(); i++) {
    // First column: inner HTML of the first <a>; second column: plain text.
    var datacol1 = rows.ElementAt(i).Elements("td").ElementAt(0).Descendants("a").ElementAt(0).InnerHtml;
    var datacol2 = rows.ElementAt(i).Elements("td").ElementAt(1).InnerText;
}

Without System.Linq you cannot call ElementAt or Count() on the sequences that Elements returns, and those are very useful! Thanks again Sujit :)
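The loop above re-enumerates the rows on every ElementAt call and throws if a row has fewer cells or no link. The following is a minimal sketch of the same traversal with foreach and null checks. It assumes the HtmlAgilityPack NuGet package is referenced; the table markup and the class name TableSketch are hypothetical. The sample includes a tbody element because markup read back from the browser's innerHTML normally contains one, while hand-written HTML may not.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class TableSketch
{
    static void Main()
    {
        // Hypothetical table; "MyID" matches the id used above.
        const string html =
            "<table id='MyID'><tbody>" +
            "<tr><td><a href='/a'>first</a></td><td>one</td></tr>" +
            "<tr><td><a href='/b'>second</a></td><td>two</td></tr>" +
            "</tbody></table>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        HtmlNode table = doc.GetElementbyId("MyID");
        // Element returns the first direct child with that name, or null.
        HtmlNode body = table?.Element("tbody");
        if (body == null) return;

        foreach (HtmlNode row in body.Elements("tr"))
        {
            // ToList avoids re-enumerating the cells for each index.
            var cells = row.Elements("td").ToList();
            if (cells.Count < 2) continue;

            // FirstOrDefault returns null instead of throwing when there is no <a>.
            string link = cells[0].Descendants("a").FirstOrDefault()?.InnerHtml;
            string text = cells[1].InnerText;
            Console.WriteLine($"{link} | {text}");
        }
    }
}
```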



Source: https://stackoverflow.com/questions/44918745/c-sharp-wpf-webbrowser-mshtml-explore-dom-find-elements
