Question
I wrote some code in VB.Net a while ago that uses XElement, XDocument, etc. to store and manipulate HTML. Some of the HTML uses attribute names that contain a hyphen/dash (-). I ran into issues using LINQ to XML to search for XElements by these attributes.
Back then I found an article (I can't find it now) that indicated the solution in VB.Net was to use syntax like this:
Dim rootElement as XElement = GetARootXElement()
Dim query = From p In rootElement.<div> Where p.@<data-qid> = 5 Select p
The "magic" syntax is the @<> which somehow translates the hyphenated attribute name into a format that can be successfully used by Linq. This code works great in VB.Net.
The problem is that we have now converted all the VB.Net code to C# and the conversion utility choked on this syntax. I can't find anything about this "magic" syntax in VB.Net and so I was hoping someone could fill in the details for me, specifically, what the C# equivalent is. Thanks.
Here is an example:
<div id='stuff'>
<div id='stuff2'>
<div id='stuff' data-qid=5>
<!-- more html -->
</div>
</div>
</div>
In my code above, the rootElement would be the stuff div, and I would want to search for the inner div with the attribute data-qid=5.
Answer 1:
I can get the following to compile in C# - I think it's equivalent to the original VB (note that the original VB had Option Strict Off):
XElement rootElement = GetARootXElement();
// The (string) cast returns null for a div without data-qid, so it is skipped instead of throwing
var query = from p in rootElement.Elements("div")
            where (string)p.Attribute("data-qid") == "5"
            select p;
Here's my (revised) test, which finds the div with the 'data-qid' attribute:
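var xml = System.Xml.Linq.XElement.Parse("<div id='stuff'><div id='stuff2'><div id='stuff3' data-qid='5'><!-- more html --></div></div></div>");
// xml is the outer 'stuff' div, so rootElement is the 'stuff2' div
var rootElement = xml.Element("div");
// The (string) cast returns null for a div without data-qid, so it is skipped instead of throwing
var query = from p in rootElement.Elements("div")
            where (string)p.Attribute("data-qid") == "5"
            select p;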
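Note that Elements("div") only looks at direct children of rootElement, which is also what rootElement.<div> does in VB. If the target div may sit at any depth, as in the question's markup, something like the following might help. This is only a minimal sketch: the class name HyphenatedAttributeSearch and the helper FindDivsByQid are hypothetical names made up for illustration, not part of any library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class HyphenatedAttributeSearch
{
    // Hypothetical helper (not from the original answers): returns every <div>
    // at any depth below root whose data-qid attribute equals qid.
    public static IEnumerable<XElement> FindDivsByQid(XElement root, string qid)
    {
        // A hyphenated name is fine as a plain string; it converts to XName implicitly.
        // The (string) cast yields null when the attribute is absent, so divs
        // without data-qid are skipped rather than causing an exception.
        return root.Descendants("div")
                   .Where(d => (string)d.Attribute("data-qid") == qid);
    }

    public static void Main()
    {
        var root = XElement.Parse(
            "<div id='stuff'><div id='stuff2'><div id='stuff3' data-qid='5'>" +
            "<!-- more html --></div></div></div>");

        foreach (var div in FindDivsByQid(root, "5"))
        {
            Console.WriteLine((string)div.Attribute("id")); // prints "stuff3"
        }
    }
}
```

Descendants is used here instead of Elements so the search is not limited to one level; if only direct children should match, swapping it back to Elements keeps the rest unchanged.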
Answer 2:
Use HtmlAgilityPack (available from NuGet) to parse HTML. Here is an example:
// Requires: using HtmlAgilityPack;
HtmlDocument doc = new HtmlDocument();
doc.Load("index.html");
// SelectSingleNode returns null when nothing matches
var innerDiv = doc.DocumentNode.SelectSingleNode("//div[@id='stuff']/*/div[@data-qid=5]");
This XPath query gets the inner div tag whose data-qid equals 5. The outer div must also have an id equal to 'stuff'. And here is how to get the data-qid attribute value:
var qid = innerDiv.Attributes["data-qid"].Value; // 5
Answer 3:
Instead of using HtmlAgilityPack, as suggested by Sergey Berezovskiy, there is an easier way that needs no extra library: the Extensions class in the System.Xml.XPath namespace provides XPath extension methods for LINQ to XML:
using System.Xml.Linq;
using System.Xml.XPath;
// html is a string holding the markup, which must be well-formed XML
var xml = XElement.Parse(html);
var innerDiv = xml.XPathSelectElement("//div[@id='stuff' and @data-qid=5]");
Source: https://stackoverflow.com/questions/17202306/searching-for-xelement-with-attribute-name-that-contain-hyphens-dashes