Obtain child anchor element within WebBrowser control

廉价感情. Submitted on 2019-12-10 16:59:56

Question


Preamble

I'm using the WebBrowser control, which a user will interact with, so a solution will need to work with a visible WebBrowser control.

Question

How do I check if an element has an anchor as a child? All browsers are able to detect that an element contains an anchor (<a href=""...) and offer "open in new tab" functionality. That is what I am attempting to replicate. However, when I right-click on an HtmlElement I'm only able to obtain the parent element.

Example

Taking the BBC website as an example, when I right-click on the highlighted element (picture below), my output is DIV, but the page source shows an anchor element as a child of this div.

SSCCE

using System;
using System.Diagnostics;
using System.Windows.Forms;

namespace BrowserLinkClick
{
    public partial class Form1 : Form
    {
        private WebBrowser wb;
        private bool firstLoad = true;

        public Form1()
        {
            InitializeComponent();
        }

        private void Form1_Load(object sender, EventArgs e)
        {
            wb = new WebBrowser();
            wb.Dock = DockStyle.Fill;
            Controls.Add(wb);
            wb.Navigate("http://bbc.co.uk");
            wb.DocumentCompleted += wb_DocumentCompleted;
        }

        private void Document_MouseDown(object sender, HtmlElementEventArgs e)
        {
            if (e.MouseButtonsPressed == MouseButtons.Right)
            {
                HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
                //I assume I need to check if this element has child elements that contain a TagName "A"
                if (element.TagName == "A")
                    Debug.WriteLine("Get link location, open in new tab.");
                else
                    Debug.WriteLine(element.TagName);
            }
        }


        private void wb_DocumentCompleted(object sender, WebBrowserDocumentCompletedEventArgs e)
        {
            if (firstLoad)
            {
                wb.Document.MouseDown += new HtmlElementEventHandler(Document_MouseDown);
                firstLoad = false;
            }
        }

    }
}

Please test any proposed solution using the BBC website and the highlighted headline (the headline changes, but the DOM remains the same).


Answer 1:


You have to get the child elements of element before checking if it's an anchor:

HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
foreach (HtmlElement child in element.Children)
{
    if (child.TagName == "A")
        Debug.WriteLine("Get link location, open in new tab.");
}
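
Note that Children only covers direct children, so an anchor nested more deeply, or an anchor that wraps the clicked element, would be missed. A minimal sketch of a broader lookup follows, assuming it is acceptable to take the nearest enclosing anchor first and otherwise the first anchor anywhere below the element. The AnchorFinder class and FindAnchor method are hypothetical names, not part of the WebBrowser API; HtmlElement.Parent and HtmlElement.GetElementsByTagName are the standard members used.

```csharp
using System;
using System.Windows.Forms;

// Hypothetical helper for illustration; not part of the original answers.
static class AnchorFinder
{
    // Returns an anchor related to the hit element, or null if none is found.
    public static HtmlElement FindAnchor(HtmlElement element)
    {
        if (element == null)
            return null;

        // 1. The element itself or one of its ancestors may be the anchor.
        for (HtmlElement current = element; current != null; current = current.Parent)
        {
            if (IsAnchor(current))
                return current;
        }

        // 2. Otherwise take the first anchor anywhere below the element
        //    (GetElementsByTagName searches all descendants, not just children).
        HtmlElementCollection anchors = element.GetElementsByTagName("A");
        return anchors.Count > 0 ? anchors[0] : null;
    }

    private static bool IsAnchor(HtmlElement element)
    {
        return string.Equals(element.TagName, "A", StringComparison.OrdinalIgnoreCase);
    }
}
```

In the MouseDown handler this could be called as AnchorFinder.FindAnchor(element), with a null result meaning no link was found near the clicked element.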



Answer 2:


To access the needed properties, cast the HtmlElement's DomElement to one of the unmanaged MSHTML interfaces, e.g. IHTMLAnchorElement.

You have to add a COM reference to the Microsoft HTML Object Library to your project.
(The file name is mshtml.tlb.)

foreach (HtmlElement child in element.Children)
{
    if (String.Equals(child.TagName, "a", StringComparison.OrdinalIgnoreCase))
    {
        var anchorElement = (mshtml.IHTMLAnchorElement)child.DomElement;
        Console.WriteLine("href: [{0}]", anchorElement.href);
    }
}

There are plenty of such interfaces but MSDN will help you choose. :)

Scripting Object Interfaces (MSHTML)




Answer 3:


I propose the following solution.
The url variable will hold the URL of the link; you'll be able to see it in the debugger's output window.

private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        if (element.TagName == "A")
        {
            Debug.WriteLine("Get link location, open in new tab.");
            // Slice the double-quoted href value out of the element's markup.
            var urlRaw = element.OuterHtml;
            string hrefBegin = "href=\"";
            var idxHref = urlRaw.IndexOf(hrefBegin, StringComparison.OrdinalIgnoreCase);
            if (idxHref >= 0)
            {
                idxHref += hrefBegin.Length;
                var idxEnd = urlRaw.IndexOf("\"", idxHref);
                // Skip anchors whose href is empty or not terminated.
                if (idxEnd > idxHref)
                {
                    var url = urlRaw.Substring(idxHref, idxEnd - idxHref);
                    Debug.WriteLine(url);
                }
            }
        }
        else
            Debug.WriteLine(element.TagName);
    }
}
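
The string slicing in this answer assumes the href value is wrapped in double quotes. The OuterHtml sample quoted in Answer 5 shows that the control may leave other attributes unquoted, so a slightly more tolerant parser may be useful. The following is only a sketch under that assumption: HrefParser and ExtractHref are hypothetical names, and the pattern accepts double-quoted, single-quoted, or unquoted values. Where possible, element.GetAttribute("href") is likely the simpler route.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper for illustration; not part of the original answers.
static class HrefParser
{
    // Matches href="...", href='...' or an unquoted href=value.
    // The lookbehind avoids matching attributes such as data-href.
    private static readonly Regex HrefPattern = new Regex(
        "(?<![\\w-])href\\s*=\\s*(?:\"(?<url>[^\"]*)\"|'(?<url>[^']*)'|(?<url>[^\\s>]+))",
        RegexOptions.IgnoreCase);

    // Returns the href value found in the markup, or null if there is none.
    public static string ExtractHref(string outerHtml)
    {
        if (string.IsNullOrEmpty(outerHtml))
            return null;

        Match match = HrefPattern.Match(outerHtml);
        return match.Success ? match.Groups["url"].Value : null;
    }
}
```

Inside the handler, HrefParser.ExtractHref(element.OuterHtml) would replace the IndexOf and Substring arithmetic; a null result means the markup carried no href.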



Answer 4:


There has to be something else wrong with your program. On the BBC website your code works for the news articles (although I see the non-UK version of the site). On other websites where anchor elements are children of the clicked element, the code below works:

private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        if (element.Children.Count > 0)
        {
            foreach (HtmlElement child in element.Children)
            {
                if (child.TagName == "A")
                    Debug.WriteLine("Get link location, open in new tab.");
            }
        }
        else
        {
            //I assume I need to check if this element has child elements that contain a TagName "A"
            if (element.TagName == "A")
                Debug.WriteLine("Get link location, open in new tab.");
            else
                Debug.WriteLine(element.TagName);
        }
    }
}



Answer 5:


The challenge with the BBC website is that it takes a slightly non-standard approach to its links. Here is a sample of one of its anchors:

<A tabIndex=-1 aria-hidden=true class=block-link__overlay-link href="http://www.bbc.com/news/world-africa-36132482" rev=hero5|overlay>Barbie challenges the 'white saviour complex' </A>

So you need two checks in the if:
1. element.TagName == "A"
2. a non-empty href attribute, read with element.GetAttribute("href")

Together these two checks guarantee that you are dealing with an a tag and that the tag has an href attribute. Here is an example:

private void Document_MouseDown(object sender, HtmlElementEventArgs e)
{
    if (e.MouseButtonsPressed == MouseButtons.Right)
    {
        HtmlElement element = wb.Document.GetElementFromPoint(PointToClient(MousePosition));
        //I assume I need to check if this element has child elements that contain a TagName "A"
        // An anchor with a non-empty href is a real link.
        if (element.TagName == "A" && !string.IsNullOrEmpty(element.GetAttribute("href")))
        {
            Debug.WriteLine("Get link location, open in new tab.");
            var url = element.GetAttribute("href");
            Debug.WriteLine(url);
        }
        else
            Debug.WriteLine(element.TagName);
    }
}


Source: https://stackoverflow.com/questions/36767938/obtain-child-anchor-element-within-webbrowser-control
