xpath | 易学教程

Scrapy repeating rows

Question: I'm trying to scrape this site: https://www.tahko.com/fi/menovinkit/?ql=tapahtumat. In particular, I'm trying to scrape the 3 tables on the site. I selected the tables with tables = response.xpath('//*[@class="table table-stripefd"]'). Then I wanted each table's rows, which I got with rows = tables.xpath('//tr'). The problem is that after scraping and printing out some of the data, I noticed that there are multiple entries for some rows. For example, the
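An XPath expression that starts with // is evaluated from the document root, whatever node it is called on, so tables.xpath('//tr') returns every row in the document once per selected table; the relative form .//tr stays inside each table. The sketch below illustrates the difference using Python's standard xml.etree.ElementTree rather than Scrapy. The markup and the class name "t" are made-up stand-ins for the real page, and the document-wide search is repeated by hand to mimic what the absolute expression does.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page: three small tables in one document.
HTML = """
<html><body>
  <table class="t"><tr><td>a1</td></tr><tr><td>a2</td></tr></table>
  <table class="t"><tr><td>b1</td></tr></table>
  <table class="t"><tr><td>c1</td></tr><tr><td>c2</td></tr></table>
</body></html>
"""

root = ET.fromstring(HTML)
tables = root.findall(".//table[@class='t']")

# Document-wide search repeated once per table: mimics tables.xpath('//tr').
repeated = [td.text for _ in tables for td in root.findall(".//tr/td")]

# Search relative to each table: mimics tables.xpath('.//tr').
scoped = [td.text for table in tables for td in table.findall(".//tr/td")]

print(len(repeated))  # 15 (5 rows seen 3 times)
print(scoped)         # ['a1', 'a2', 'b1', 'c1', 'c2']
```

In Scrapy the corresponding change would be rows = tables.xpath('.//tr').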

How to use Select-Xml in Powershell to print the content of an xml node element?

Question: Restricted to XPath or even Select-Xml, how else are the book titles printed?
PS /home/nicholas/powershell> Select-Xml "./bookstore.xml" -XPath "/bookstore/book/title" | foreach {$_.node.InnerXML}
Pride And Prejudice
The Handmaid's Tale
Emma
Sense and Sensibility
PS /home/nicholas/powershell> Select-Xml -Path "./bookstore.xml"
cmdlet Select-Xml at command pipeline position 1
Supply values for the following parameters:
XPath:

How do I extract data from dynamic updating webpages

Question: I want to scrape reviews from the Sephora website. The reviews are dynamically updated. After inspecting the page, I found that the review is here in the HTML:
<div class="css-eq4i08 " data-comp="Ellipsis Box">Honestly I never write reviews but this is a must if you have frizzy after even after straightening it! It smells fantastic and it works wonders definitely will be restocking once I’m done this one !!</div>
I want to write Python Selenium code to read the review. The code I wrote is here... from

UnexpectedTagNameException: Message: Select only works on <select> elements, not on <li> error selecting li element from a Dropdown using Selenium

Question: I wish to click on New Test. I'm new here and beginning to learn automation using selenium-python. The HTML looks something like this:
<li id="testing">
  <ul class="dd">
    <li><a href="javascript:toolsPopup('/abc/xyz/text.html');"><span>New Test</span></a></li>
    <li><a href="javascript:toolsPopup('/abc/xyz/list.html');"><span>Test List</span></a></li>
  </ul>
</li>
The code that I'm trying to use:
element=driver.find_element_by_id('testing')
drp=Select(element)
drp.select_by_visible_text('New

Using a variable as part of a XPath selection

Question: I'm looking to use a variable as part of an XPath expression. My problem might be the msxsl node-set function... not sure. But don't let that cloud your judgement... read on... I'm using a .NET function to load the content file, which is passed in via bespoke XML content. The @file results in an XML file. The bespoke XML that sits on the page looks like:
<control name="import" file="information.xml" node="r:container/r:group[@id='set01']/r:item[1]" />
The XSL looks like:
<xsl:variable

extract XML attributes and node values R

Question: I have an XML file in R. The XML file looks like this:
rootNode <- xmlRoot(xmlfile)
rootNode[[1]]
<pdv id="1000001" latitude="4620114" longitude="519791" cp="01000" pop="R">
  <adresse>ROUTE NATIONALE</adresse>
  <ville>SAINT-DENIS-LèS-BOURG</ville>
  <ouverture debut="01:00" fin="01:00" saufjour=""/>
  <services>
    <service>Automate CB</service>
    <service>Vente de gaz domestique</service>
    <service>Station de gonflage</service>
  </services>
  <prix nom="Gazole" id="1" maj="2014-01-02 11:08:03" valeur=
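The question is about R, but the general idea (read the attributes from the element itself and the values from its child nodes) can be sketched in Python with the standard xml.etree.ElementTree module. This is only an illustration, not the asker's R code: the sample is rebuilt from the excerpt, the truncated <prix> element is left out, and the record dictionary is a hypothetical name.

```python
import xml.etree.ElementTree as ET

# Sample rebuilt from the excerpt; the truncated <prix> element is omitted.
XML = """
<pdv id="1000001" latitude="4620114" longitude="519791" cp="01000" pop="R">
  <adresse>ROUTE NATIONALE</adresse>
  <ville>SAINT-DENIS-LèS-BOURG</ville>
  <ouverture debut="01:00" fin="01:00" saufjour=""/>
  <services>
    <service>Automate CB</service>
    <service>Vente de gaz domestique</service>
    <service>Station de gonflage</service>
  </services>
</pdv>
"""

pdv = ET.fromstring(XML)

# Attributes of <pdv> come back as a plain dict of strings.
record = dict(pdv.attrib)

# Node values are the text content of the child elements.
record["adresse"] = pdv.findtext("adresse")
record["ville"] = pdv.findtext("ville")
record["services"] = [s.text for s in pdv.findall("services/service")]

print(record["id"], record["cp"], record["adresse"])
print(record["services"])
```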

xpath to get data starts with specific character or string

Question: I need to extract certain text elements from the following code.
<div class="inhalt-links">
  <h2> Deutsche Verkehrswacht <br> Verkehrswacht Dortmund e. V. <br> </h2>
  <h3> Standnummer: <span style="font-weight: normal;">4.E08</span> </h3>
  <div class="clear"></div>
  <br> Benediktinerstraße 82
  <br> 44287 Dortmund
  <br> Deutschland
  <br>
  <br> Tel.:+49 231 447687
  <br> Fax:+49 231 447136
  <br> E-Mail:info@verkehrswacht-dortmund.de
  <br> <a href="http://www.verkehrswacht-dortmund.de" class="url" target="

Python Selenium - best way to capture element (Xpath or CSS selector) in this case (an input box)?

Question: I am trying to scrape a website with Selenium, mostly using XPath or CSS selectors to grab elements. However, I am noticing that these are dynamic (even though I read online that CSS selectors shouldn't be), and I have to rewrite the code often. I am fairly new to this and would like help figuring out the best way to do this. Below is an example of an element, an input box, that I am trying to grab. I understand more definitive selectors like ID are more robust to

How to ignore namespaces in XPath 1.0?

Question: I'm having serious issues trying to understand the magic that is XPath. Basically, I have some XML like so:
<a> <b> <c/> </b> </a>
Now, I want to count how many B's we have, without C's. This can be done easily with the following XPath:
count(*/b[not(descendant::c)])
Now the question is this simple: how do I do the same thing while ignoring any namespaces? I'd imagine it was something like this:
count(*/[local-name()='b']/[not(descendant::[local-name()='c'])])
But this is not correct. What
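A predicate in XPath 1.0 has to follow a node test, so the attempted expression is invalid because nothing precedes each [ ... ]. A well-formed namespace-agnostic version would be count(*/*[local-name()='b'][not(descendant::*[local-name()='c'])]). Python's standard xml.etree.ElementTree does not support local-name(), so the sketch below swaps in a different technique, its {*} namespace wildcard (Python 3.8+), to compute the same count. The namespace URI and the extra <b> elements are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespaced version of the excerpt's XML, with extra <b> elements.
XML = """
<a xmlns="urn:example:ns">
  <b><c/></b>
  <b/>
  <b><d/></b>
</a>
"""

root = ET.fromstring(XML)  # root is the <a> element

# "{*}b" matches a child named b in any namespace (or in none).
b_elements = root.findall("{*}b")

# Keep only the <b> elements that have no <c> descendant in any namespace.
b_without_c = [b for b in b_elements if b.find(".//{*}c") is None]

print(len(b_without_c))  # 2
```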

How to get parent element text and remove child element text selenium c#?

Question: I'm new to automation testing and am using Selenium with C#. I want to get the text of an element, but the markup is:
<div class="item"> Text1 <span>textdontwant</span> </div>
and my statement
driver.FindElement(By.XPath("//*[@id='contactList']/div[1]/div/div[1]/div/div/div/div[2]")).Text;
returns: Text1textdontwant
I need only the parent element's text: Text1
Does anyone have a solution? Thanks so much! Here is the HTML:
<div id="contactList" class="web z-data-list" tabindex="30" style="position:
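The distinction the question runs into is between an element's own text node and the combined text of the element plus its descendants. The sketch below shows that distinction on the small snippet from the excerpt, using Python's standard xml.etree.ElementTree instead of Selenium and C#. It only illustrates the idea: Selenium locators return elements rather than bare text nodes, so the equivalent there would need a different mechanism.

```python
import xml.etree.ElementTree as ET

# The small snippet quoted in the question.
SNIPPET = '<div class="item"> Text1 <span>textdontwant</span> </div>'
div = ET.fromstring(SNIPPET)

# .text holds only the text that appears before the first child element.
own_text = (div.text or "").strip()

# itertext() walks the element and all descendants, which reproduces
# the combined string reported in the question.
all_text = "".join(t.strip() for t in div.itertext())

print(own_text)  # Text1
print(all_text)  # Text1textdontwant
```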