HtmlElement doesn't parse the tag properly

Posted by 拟墨画扇 on 2019-12-11 06:19:21

Question


I have the following line in my html source:

<input class="phone" name="site_url" type="text" placeholder="Enter Your Website URL">

When I navigate with the WebBrowser control (C#), load my site into an HtmlDocument object, and loop over each HtmlElement, I can't get the placeholder attribute of the input element above: GetAttribute("placeholder") returns "". I checked the OuterHtml/InnerHtml properties and noticed that the placeholder attribute is emitted with quotes ("") while the other attributes are not. Moreover, I can retrieve the other attributes (name, class) without any problem.

This is the output of InnerHtml/OuterHtml:

<INPUT class=phone placeholder="Enter Your Website URL" name=site_url>

Can anybody explain why that is, and how I can change the placeholder in this case?


Answer 1:


By default, the WebBrowser control runs in IE7 compatibility mode, in which the placeholder attribute is not supported. So first you need to switch it into IE10 mode using WebBrowser Feature Control. Then you need to call the unmanaged getAttributeNode method and read its value, like this:

bool FindElementWithPlaceholder(HtmlElement root, string placeholder, ref HtmlElement found, ref string value)
{
    foreach (var child in root.Children)
    {
        var childElement = (HtmlElement)child;
        dynamic domElement = childElement.DomElement;
        dynamic attrNode = domElement.getAttributeNode(placeholder);
        if (attrNode != null)
        {
            string v = attrNode.value;
            if (!String.IsNullOrWhiteSpace(v))
            {
                value = v;
                found = childElement;
                return true;
            }
        }
        if (FindElementWithPlaceholder(childElement, placeholder, ref found, ref value))
            return true;
    }
    return false;
}

// ...

HtmlElement element = null;
string value = null;
if (FindElementWithPlaceholder(this.WB.Document.Body, "placeholder", ref element, ref value))
    MessageBox.Show(value);

This code has been tested with IE10.

[EDITED] You can still retrieve the value of placeholder with the above code even if WebBrowser Feature Control is not implemented. However, placeholder won't function visually in that case, because the document won't be in HTML5 mode.
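For reference, Feature Control is typically enabled by writing a per-application DWORD under the FEATURE_BROWSER_EMULATION registry key before the first WebBrowser instance is created. The following is a minimal sketch, not part of the original answer: the class and method names (BrowserEmulation, ValueFor, Enable) are hypothetical, and it assumes the per-user (HKEY_CURRENT_USER) key and the "honor the DOCTYPE" values, which follow the pattern major version times 1000 (10000 for IE10).

```csharp
using System;
using System.Diagnostics;
using System.IO;
using Microsoft.Win32;

// Hypothetical helper; the names are illustrative only.
static class BrowserEmulation
{
    // Per-user Feature Control key consulted by the hosted IE engine.
    const string KeyPath =
        @"HKEY_CURRENT_USER\Software\Microsoft\Internet Explorer\Main\FeatureControl\FEATURE_BROWSER_EMULATION";

    // Documented "honor DOCTYPE" values: 7000, 8000, 9000, 10000, 11000.
    public static int ValueFor(int ieMajorVersion)
    {
        return ieMajorVersion * 1000;
    }

    // Windows only. Call this before creating the WebBrowser control.
    public static void Enable(int ieMajorVersion)
    {
        // The value name is the host executable's file name, e.g. "MyApp.exe".
        string exeName = Path.GetFileName(Process.GetCurrentProcess().MainModule.FileName);
        Registry.SetValue(KeyPath, exeName, ValueFor(ieMajorVersion), RegistryValueKind.DWord);
    }
}
```

Calling BrowserEmulation.Enable(10) at application startup would request IE10 mode for pages that declare a standards DOCTYPE. When debugging under the Visual Studio hosting process, the executable name may differ (for example, a .vshost.exe name), so the value may need to be registered for that name as well.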

[EDITED] Perhaps I finally understand what you want. Try this code and see if it does that. You still need Feature Control and a DOCTYPE to enable HTML5.

HTML: <!doctype html><html><input class=phone placeholder="Enter Your Website URL" name=site_url></html>

HtmlElement element = null;
string oldValue = null;
string newValue = "New Value";
FindElementWithPlaceholder(this.webBrowser1.Document.Body, "placeholder", ref element, ref oldValue, newValue);

bool FindElementWithPlaceholder(HtmlElement root, string placeholder, ref HtmlElement found, ref string oldValue, string newValue)
{
    foreach (var child in root.Children)
    {
        var childElement = (HtmlElement)child;
        dynamic domElement = childElement.DomElement;
        dynamic attrNode = domElement.getAttributeNode(placeholder);
        if (attrNode != null)
        {
            string v = attrNode.value;
            if (!String.IsNullOrWhiteSpace(v))
            {
                domElement.removeAttributeNode(attrNode);
                domElement.setAttribute(placeholder, newValue);
                // a hack to make IE10 render the new placeholder:
                // tag the element with a temporary id, re-assign outerHTML, then restore the original id
                var id = domElement.getAttribute("id");
                var uniqueId = Guid.NewGuid().ToString();
                domElement.setAttribute("id", uniqueId);
                var html = domElement.outerHTML;
                domElement.outerHTML = html;
                var newElement = root.Document.GetElementById(uniqueId);
                domElement = newElement.DomElement;
                if (String.IsNullOrEmpty(id))
                    domElement.removeAttribute("id");
                else
                    domElement.setAttribute("id", id);
                found = newElement;
                oldValue = v;
                return true;
            }
        }
        if (FindElementWithPlaceholder(childElement, placeholder, ref found, ref oldValue, newValue))
            return true;
    }
    return false;
}



Answer 2:


HtmlElement exposes only those attributes that are common to all elements, leaving out those that apply only to certain types of elements.

HtmlElement.GetAttribute is identical to IHTMLElement::getAttribute(strAttributeName, 0)

The way getAttribute works changed in Internet Explorer 8; check the Remarks section of its documentation. To work around this, you can manually parse InnerHtml to extract the custom placeholder attribute.
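The manual-parsing workaround could look roughly like the sketch below. It is an illustration only: AttributeParser and GetAttribute are hypothetical names, and the regular expression assumes the input is the markup of a single tag, such as the OuterHtml of the input element shown in the question. It handles double-quoted, single-quoted, and unquoted values, since IE emits a mix of those forms, but it is not a full HTML parser and can be fooled by attribute-like text inside another attribute's value.

```csharp
using System;
using System.Net;
using System.Text.RegularExpressions;

// Hypothetical helper; the names are illustrative only.
static class AttributeParser
{
    // Returns the value of the named attribute in a single tag's markup,
    // or null when the attribute is not present.
    public static string GetAttribute(string tagHtml, string name)
    {
        // name = "value" | name = 'value' | name = value
        string pattern = @"\s" + Regex.Escape(name) +
            @"\s*=\s*(?:""(?<v>[^""]*)""|'(?<v>[^']*)'|(?<v>[^\s>]+))";
        Match m = Regex.Match(tagHtml, pattern, RegexOptions.IgnoreCase);
        // Decode entities such as &amp; that may appear in the markup.
        return m.Success ? WebUtility.HtmlDecode(m.Groups["v"].Value) : null;
    }
}
```

With the WebBrowser control this might be used as AttributeParser.GetAttribute(element.OuterHtml, "placeholder"), where element is the HtmlElement for the input.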



Source: https://stackoverflow.com/questions/18445962/htmlelement-doesnt-parse-the-tag-properly
