html-agility-pack

htmlagilitypack getting an element's node by the name

試著忘記壹切 submitted on 2021-01-28 12:24:32
Question: How can I get the node of an element by its name? There is GetElementById, so why is there no GetElementByName? The element in question is: <select class="box1" name="DAY" tabindex="31"> … </select> I want to be able to get this node, but I have no idea how. Pete: please remove the note that this question has been answered. It is totally wrong; go try it yourself. node.Name is not the value of the 'name' attribute; it is the tag name, which is not what I need. Answer 1: You are not accessing the node attribute called
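
The question above asks for a lookup by the name attribute rather than by tag name. The following is a minimal sketch of one common approach, an XPath attribute predicate passed to SelectSingleNode; it is not taken from the truncated answer. The class name SelectByNameExample and the helper FindByNameAttribute are hypothetical names introduced for illustration, and the sample markup leaves the select element empty because its contents are elided in the question.

```csharp
using System;
using HtmlAgilityPack;

public static class SelectByNameExample
{
    // Hypothetical helper: returns the first <tagName> element whose name
    // attribute equals nameValue, or null when nothing matches.
    // Assumes nameValue contains no single quote (it is embedded in the XPath).
    public static HtmlNode FindByNameAttribute(HtmlDocument doc, string tagName, string nameValue)
    {
        string xpath = string.Format("//{0}[@name='{1}']", tagName, nameValue);
        return doc.DocumentNode.SelectSingleNode(xpath);
    }

    public static void Main()
    {
        var doc = new HtmlDocument();
        // Sample markup modelled on the element in the question.
        doc.LoadHtml("<html><body><select class=\"box1\" name=\"DAY\" tabindex=\"31\"></select></body></html>");

        HtmlNode node = FindByNameAttribute(doc, "select", "DAY");
        if (node == null)
        {
            Console.WriteLine("not found");
        }
        else
        {
            // node.Name is the tag name; the attribute is read separately.
            Console.WriteLine(node.Name + " / " + node.GetAttributeValue("name", ""));
        }
    }
}
```

SelectSingleNode returns null when no element matches, so the result should be checked before it is used.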

HTML Agility Pack Screen Scraping XPATH isn't returning data

懵懂的女人 submitted on 2021-01-28 11:36:02
Question: I'm attempting to write a screen scraper for Digikey that will allow our company to keep accurate track of pricing, part availability, and product replacements when a part is discontinued. There seems to be a discrepancy between the XPath that I see in Chrome DevTools and in Firebug on Firefox and what my C# program sees. The page that I'm currently scraping is http://search.digikey.com/scripts/DkSearch/dksus.dll?Detail&name=296-12602-1-ND The code I'm currently using is pretty

Html Agility Pack how to get dynamically generated content after page loads

こ雲淡風輕ζ submitted on 2021-01-28 05:50:45
Question: I am attempting to get information from "https://www.sideshow.com/collectibles?manufacturer=Hot+Toys", specifically the div with class "c-ProductList row ss-targeted", but no information seems to be retrieved. Any clues? var test = page.DocumentNode.SelectNodes("//div[@class='c-ProductList row ss-targeted']"); Answer 1: The content you want is generated after the page loads, using JavaScript and Ajax. HAP cannot get it unless it runs a browser in the background and executes the scripts on the page. .Net Core 2.0

selecting Node does not work using HtmlAgilityPack

吃可爱长大的小学妹 submitted on 2021-01-27 06:39:35
Question: I am using VS2010 with HtmlAgilityPack 1.4.6 (from the Net40 folder). The following is my HTML: <html> <body> <div id="header"> <h2 id="hd1"> Patient Name </h2> </div> </body> </html> I am using the following C# code to access "hd1". Please tell me the correct way to do it. HtmlAgilityPack.HtmlDocument htmlDoc = new HtmlAgilityPack.HtmlDocument(); try { string filePath = "E:\\file1.htm"; htmlDoc.LoadHtml(filePath); if (htmlDoc.DocumentNode != null) { HtmlNodeCollection _hdPatient = htmlDoc.DocumentNode
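
One thing visible in the excerpt above is that a file path is passed to LoadHtml, which parses its argument as markup rather than reading a file; Load is the method that takes a path. The sketch below shows that difference under stated assumptions: the class name LoadFromFileExample and the helper ReadHeading are hypothetical, the sample file is written to the temp directory rather than E:\file1.htm, and the rest of the asker's code is truncated, so this may not be the only issue.

```csharp
using System;
using System.IO;
using HtmlAgilityPack;

public static class LoadFromFileExample
{
    // Hypothetical helper: loads an HTML file from disk and returns the trimmed
    // text of <h2 id="hd1">, or null when that element is absent.
    public static string ReadHeading(string filePath)
    {
        var htmlDoc = new HtmlDocument();
        htmlDoc.Load(filePath); // Load reads the file; LoadHtml would parse the path string itself as HTML

        HtmlNode heading = htmlDoc.DocumentNode.SelectSingleNode("//h2[@id='hd1']");
        return heading == null ? null : heading.InnerText.Trim();
    }

    public static void Main()
    {
        // Write the markup from the question to a temporary file (assumed location).
        string path = Path.Combine(Path.GetTempPath(), "file1.htm");
        File.WriteAllText(path,
            "<html> <body> <div id=\"header\"> <h2 id=\"hd1\"> Patient Name </h2> </div> </body> </html>");

        string text = ReadHeading(path);
        Console.WriteLine(text ?? "hd1 not found");
    }
}
```

For an id lookup, htmlDoc.GetElementbyId("hd1") (note the lowercase "b" in the method name) is an alternative to the XPath expression.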