xpath

Scrapy repeating rows

邮差的信, submitted on 2021-01-29 07:21:08
Question: I'm trying to scrape this site: https://www.tahko.com/fi/menovinkit/?ql=tapahtumat. In particular, I'm trying to scrape the 3 tables on the site. I've managed this with
tables = response.xpath('//*[@class="table table-stripefd"]')
Then I'd like to get each of the rows of the table, which I did with
rows = tables.xpath('//tr')
The problem is that after scraping and printing out some of the data, I noticed that there are multiple entries for some rows. For example, the
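A likely cause of the duplicates, judging only from the two calls quoted above: an XPath that starts with `//` is evaluated from the document root even when it is called on a sub-selection, so `tables.xpath('//tr')` returns every row on the page once for each matched table. Prefixing the path with `.` keeps the search inside the current table. The sketch below illustrates the relative form with Python's standard `xml.etree.ElementTree` rather than Scrapy; the sample markup, the `events` class, and the `rows_per_table` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for the page: two tables with different row counts.
PAGE = """
<html><body>
  <table class="events"><tr><td>a1</td></tr><tr><td>a2</td></tr></table>
  <table class="events"><tr><td>b1</td></tr></table>
</body></html>
"""


def rows_per_table(markup):
    """Return, for each table, the cell texts of the rows inside that table only."""
    root = ET.fromstring(markup)
    result = []
    for table in root.iter("table"):
        # The leading "." keeps the search relative to this table element.
        rows = table.findall(".//tr")
        result.append([row.findtext("td") for row in rows])
    return result


print(rows_per_table(PAGE))
```

In Scrapy, the corresponding change would be to loop over `tables` and call `table.xpath('.//tr')` on each one.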

How to use Select-Xml in Powershell to print the content of an xml node element?

…衆ロ難τιáo~, submitted on 2021-01-29 06:00:39
Question: Restricted to XPath, or even Select-Xml, how else are the book titles printed?
PS /home/nicholas/powershell>
PS /home/nicholas/powershell> Select-Xml "./bookstore.xml" -XPath "/bookstore/book/title" | foreach {$_.node.InnerXML}
Pride And Prejudice
The Handmaid's Tale
Emma
Sense and Sensibility
PS /home/nicholas/powershell>
PS /home/nicholas/powershell> Select-Xml -Path "./bookstore.xml"
cmdlet Select-Xml at command pipeline position 1
Supply values for the following parameters:
XPath:

How do I extract data from dynamic updating webpages

徘徊边缘, submitted on 2021-01-29 05:50:33
Question: I want to scrape reviews from the Sephora website. The reviews are dynamically updated. After inspecting the page, I found that the review is here in the HTML code:
<div class="css-eq4i08 " data-comp="Ellipsis Box">Honestly I never write reviews but this is a must if you have frizzy after even after straightening it! It smells fantastic and it works wonders definitely will be restocking once I’m done this one !!</div>
I want to write Python Selenium code to read the review. The code I wrote is here... from

UnexpectedTagNameException: Message: Select only works on <select> elements, not on <li> error selecting li element from a Dropdown using Selenium

狂风中的少年, submitted on 2021-01-29 04:49:59
Question: I wish to click on New Test. The HTML code looks something like this. I'm new here and beginning to learn automation using selenium-python.
<li id="testing">
  <ul class="dd">
    <li><a href="javascript:toolsPopup('/abc/xyz/text.html');"><span>New Test</span></a></li>
    <li><a href="javascript:toolsPopup('/abc/xyz/list.html');"><span>Test List</span></a></li>
  </ul>
</li>
The code that I'm trying to use:
element=driver.find_element_by_id('testing')
drp=Select(element)
drp.select_by_visible_text('New

Using a variable as part of a XPath selection

房东的猫, submitted on 2021-01-29 04:19:53
Question: I'm looking to use a variable as part of an XPath expression. My problem might be the msxsl node-set function... not sure. But don't let that cloud your judgement... read on... I'm using a .NET function to load the content file, which is passed in via bespoke XML content. The @file attribute results in an XML file. The bespoke XML that sits on the page looks like:
<control name="import" file="information.xml" node="r:container/r:group[@id='set01']/r:item[1]" />
The XSL looks like:
<xsl:variable

extract XML attributes and node values R

主宰稳场, submitted on 2021-01-29 02:26:48
Question: I have an XML file in R. The XML file looks like this:
rootNode <- xmlRoot(xmlfile)
rootNode[[1]]
<pdv id="1000001" latitude="4620114" longitude="519791" cp="01000" pop="R">
  <adresse>ROUTE NATIONALE</adresse>
  <ville>SAINT-DENIS-LèS-BOURG</ville>
  <ouverture debut="01:00" fin="01:00" saufjour=""/>
  <services>
    <service>Automate CB</service>
    <service>Vente de gaz domestique</service>
    <service>Station de gonflage</service>
  </services>
  <prix nom="Gazole" id="1" maj="2014-01-02 11:08:03" valeur=

xpath to get data starts with specific character or string

别说谁变了你拦得住时间么, submitted on 2021-01-28 20:15:51
Question: I need to extract certain text elements from the following code.
<div class="inhalt-links">
  <h2> Deutsche Verkehrswacht <br> Verkehrswacht Dortmund e. V. <br> </h2>
  <h3> Standnummer: <span style="font-weight: normal;">4.E08</span> </h3>
  <div class="clear"></div>
  <br> Benediktinerstraße 82
  <br> 44287 Dortmund
  <br> Deutschland
  <br>
  <br> Tel.:+49 231 447687
  <br> Fax:+49 231 447136
  <br> E-Mail:info@verkehrswacht-dortmund.de
  <br> <a href="http://www.verkehrswacht-dortmund.de" class="url" target="
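XPath 1.0 provides a `starts-with()` function, so an expression along the lines of `//div[@class='inhalt-links']/text()[starts-with(normalize-space(), 'Tel.:')]` is one plausible way to pick out such lines. The sketch below does not run that XPath; it shows the same prefix filter with Python's standard `html.parser`, which tolerates the unclosed `<br>` tags. The `TextCollector` class, the `texts_starting_with` helper, and the shortened sample markup are assumptions made for illustration.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collect every non-empty text node, stripped of surrounding whitespace."""

    def __init__(self):
        super().__init__()
        self.texts = []

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.texts.append(text)


def texts_starting_with(html, prefix):
    """Return the text nodes in `html` that begin with `prefix`."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return [t for t in parser.texts if t.startswith(prefix)]


# Shortened, hypothetical version of the markup quoted in the question.
SAMPLE = (
    '<div class="inhalt-links"> <br> Deutschland <br> <br> '
    "Tel.:+49 231 447687 <br> Fax:+49 231 447136 <br> </div>"
)

print(texts_starting_with(SAMPLE, "Tel.:"))
```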

Python Selenium - best way to capture element (Xpath or CSS selector) in this case (an input box)?

為{幸葍}努か, submitted on 2021-01-28 19:50:57
Question: I am trying to scrape a website with Selenium, and I am mostly using XPath or CSS selectors to grab elements. However, I am noticing that these are dynamic (even though I read online that CSS selectors shouldn't be), and I am having to rewrite the code often. I am fairly new to this and would like help figuring out the best way to do this. Below is an example of an element, an input box, that I am trying to grab. I understand more definitive selectors like ID are more robust to

How to ignore namespaces in XPath 1.0?

冷暖自知, submitted on 2021-01-28 19:35:08
Question: I'm having serious issues trying to understand the magic that is XPath. Basically, I have some XML like so:
<a> <b> <c/> </b> </a>
Now, I want to count how many B's we have without C's. This can be done easily with the following XPath:
count(*/b[not(descendant::c)])
Now the question is simple: how do I do the same thing while ignoring any namespaces? I'd imagine it was something like this:
count(*/[local-name()='b']/[not(descendant::[local-name()='c'])])
But this is not correct. What
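In XPath 1.0 a predicate has to follow a node test, so a namespace-agnostic version would normally be spelled with `*` before each predicate, for example `count(*/*[local-name()='b'][not(descendant::*[local-name()='c'])])`. Python's standard `xml.etree.ElementTree` does not implement `local-name()`, so the sketch below expresses the same count with its `{*}` namespace wildcard (available from Python 3.8). The namespace URI, the extra `<b>` elements, and the `count_b_without_c` helper are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: the namespace URI and the extra <b> elements are invented.
DOC = """
<a xmlns="urn:example:ns">
  <b><c/></b>
  <b/>
  <b><d><c/></d></b>
</a>
"""


def count_b_without_c(markup):
    """Count <b> elements, in any namespace, that have no <c> descendant."""
    root = ET.fromstring(markup)
    # "{*}b" matches <b> in any namespace, or in none.
    all_b = root.findall(".//{*}b")
    # Keep only the <b> elements with no <c> anywhere below them.
    return sum(1 for b in all_b if b.find(".//{*}c") is None)


print(count_b_without_c(DOC))
```

The wildcard form and the `local-name()` form express the same intent: match on the element's local name and ignore whichever namespace it is bound to.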

How to get parent element text and remove child element text selenium c#?

冷暖自知, submitted on 2021-01-28 19:09:43
Question: I'm new to automation testing, and I'm using Selenium with C#. I have a problem: I want to get the text from an element, but the code is:
<div class="item"> Text1 <span>textdontwant</span> </div>
and my statement
driver.FindElement(By.XPath("//*[@id='contactList']/div[1]/div/div[1]/div/div/div/div[2]")).Text;
returns: Text1textdontwant
I need only the parent element's text: Text1
Does anyone have a solution? Thanks so much! Here is the HTML:
<div id="contactList" class="web z-data-list" tabindex="30" style="position:
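The result quoted in the question shows that the element's text includes the text of its child `<span>`. The underlying idea for getting only the parent's own text is to read its direct text nodes and skip anything nested in children. The sketch below shows that idea in Python with the standard `xml.etree.ElementTree`, not with Selenium or C#; the `own_text` helper is hypothetical, and the snippet is the one quoted in the question.

```python
import xml.etree.ElementTree as ET

# The small snippet quoted in the question.
SNIPPET = '<div class="item"> Text1 <span>textdontwant</span> </div>'


def own_text(element):
    """Join the element's direct text nodes, skipping text inside children."""
    # element.text is the text before the first child;
    # each child's tail is the text that follows that child.
    parts = [element.text or ""]
    for child in element:
        parts.append(child.tail or "")
    return " ".join(p.strip() for p in parts if p.strip())


div = ET.fromstring(SNIPPET)
print(own_text(div))
```

In XPath terms, the direct text nodes of an element are the ones selected by `text()` relative to that element.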