xpath | 易学教程

Crawl and Concatenate in Scrapy

Question: I'm trying to crawl a movie list with Scrapy (I take only the Director and Movie title fields). Sometimes there are two directors, and Scrapy scrapes them as separate items. So the first director appears along with the movie title, but the second has no movie title. So I created a condition like this: if director2: item['director'] = map(unicode.strip,titres.xpath("tbody/tr/td/div/div[2]/div[3]/div[2]/div/h2/div/a/text()").extract()) The last div[2] exists only if there are two directors. And I
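
One way to keep two directors attached to the same title is to join the extracted names into a single string before storing them. This is a minimal sketch in plain Python, assuming the extracted names arrive as a list of strings (which is what .extract() returns); join_directors and the sample names are hypothetical, not part of Scrapy or of the original question.

```python
def join_directors(names, sep=", "):
    """Strip whitespace from each extracted name and join the non-empty ones."""
    cleaned = [name.strip() for name in names]
    return sep.join(name for name in cleaned if name)


# Hypothetical extraction results: one director, then two directors.
single = join_directors(["  Jane Doe\n"])
double = join_directors(["  Jane Doe\n", " John Roe "])
```

With one field per movie, a second director no longer produces an item without a title.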

XPath find all matches C# XmlDocument

Question: I am trying to figure out how to find all matches of a string in an XmlDocument. XmlNodeList results = document.SelectNodes("Products/Product/fn:matches(.,'" + SearchWord + "')"); I'm trying to compare the innerText of Product. The above example doesn't work, though, and I guess my way of using XPath functions is very wrong. Answer 1: Evaluate this XPath 1.0 expression (note that matches() is an XPath 2.0 function and isn't supported in .NET): Products/Product/descendant::*[contains(text(),
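
The same substring filter can be sketched in Python. The standard library's xml.etree.ElementTree supports only a subset of XPath and has no contains() function, so this sketch selects the Product elements by path and does the substring test in code instead. The sample XML, the Name element, and find_products are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; the real structure is not shown in the question.
xml_text = """
<Products>
  <Product><Name>Red apple</Name></Product>
  <Product><Name>Green pear</Name></Product>
  <Product><Name>Apple juice</Name></Product>
</Products>
"""


def find_products(root, search_word):
    """Return Product elements whose combined text contains search_word."""
    return [
        product
        for product in root.findall("./Product")
        if search_word in "".join(product.itertext())
    ]


root = ET.fromstring(xml_text)
matches = find_products(root, "apple")
names = [p.findtext("Name") for p in matches]
```

Like XPath's contains(), the test is case-sensitive, so "Apple juice" is not matched by "apple".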

Selenium CSS selector for nth occurrence of td span:nth-child(2)

Question: The CSS selector td span:nth-child(2) means the 2nd child node span of td. I want to choose the nth td span:nth-child(2), something like: driver.find_element_by_css_selector("td span:nth-child(2):eq(4)") I know I can use driver.find_elements_by_css_selector("td span:nth-child(2)")[4] or XPath instead: driver.find_elements_by_xpath('(//td/span[2])[4]') I just want to know if I can do the same thing with a CSS selector. Answer 1: You can't do this with a CSS selector. :eq() is from jQuery and not part
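
The "collect all matches, then index" approach can be tried without a browser. This sketch uses xml.etree.ElementTree on a hypothetical five-row table. One detail to watch: XPath positions are 1-based while Python list indexes are 0-based, so the XPath (//td/span[2])[4] selects the same node as index [3] on the Python list, not [4].

```python
import xml.etree.ElementTree as ET

# Hypothetical table: five cells, each holding two <span> children.
rows = "".join(
    f"<tr><td><span>label {i}</span><span>value {i}</span></td></tr>"
    for i in range(1, 6)
)
table = ET.fromstring(f"<table>{rows}</table>")

# Second <span> child of every <td>, in document order.
second_spans = table.findall(".//td/span[2]")

# XPath [4] is the fourth match; in Python that is index 3.
fourth = second_spans[3]
```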

Xpath test for ancestor attribute not equal string

Question: I'm trying to test whether an attribute on an ancestor of an element is not equal to a string. Here is my XML... <aaa att="xyz"> <bbb> <ccc/> </bbb> </aaa> <aaa att="mno"> <bbb> <ccc/> </bbb> </aaa> If I'm acting on element ccc, I'm trying to test that its grandparent aaa @att doesn't equal "xyz". I currently have this... ancestor::aaa[not(contains(@att, 'xyz'))] Thanks! Answer 1: Assuming that by saying an ancestor of an element you're referring to an element with child elements, this XPath expression
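
The test itself can be sketched in Python. xml.etree.ElementTree has no ancestor axis, so this sketch builds a child-to-parent map and walks upward. It uses an exact comparison rather than contains(), which would also reject values such as "xyzzy". The root wrapper element and has_ancestor_not_equal are hypothetical; the behaviour is meant to mirror the XPath ancestor::aaa[not(@att = 'xyz')], which is an assumption about the intent, not the truncated answer's expression.

```python
import xml.etree.ElementTree as ET

# The two <aaa> fragments from the question, wrapped in a hypothetical <root>.
doc = ET.fromstring(
    "<root>"
    '<aaa att="xyz"><bbb><ccc/></bbb></aaa>'
    '<aaa att="mno"><bbb><ccc/></bbb></aaa>'
    "</root>"
)

# ElementTree elements do not know their parent, so record it explicitly.
parent_of = {child: parent for parent in doc.iter() for child in parent}


def has_ancestor_not_equal(element, tag, attr, value):
    """True if some ancestor named `tag` has `attr` different from `value`.

    An ancestor without the attribute also counts, as with not(@att = 'xyz').
    """
    node = parent_of.get(element)
    while node is not None:
        if node.tag == tag and node.get(attr) != value:
            return True
        node = parent_of.get(node)
    return False


results = [has_ancestor_not_equal(c, "aaa", "att", "xyz") for c in doc.iter("ccc")]
```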

xpath - using contains with a wildcard

Question: I have the following, and I'm trying to see if there's a better approach. I know it can be done using starts-with/contains. I'm testing with Firefox 10, which I believe implements XPath 2.+. Test node is <a id="foo"> . . . <a id="foo1"> . <a id="foo2"> Is there a way to use wildcards to get the foo1/foo2 nodes? Something like //a[@id =* 'foo'] or //a[contains(@id*,'foo')] Which would say, give me the "a" where the id starts with "foo" but has additional chars... This would then skip
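
XPath has no wildcard syntax inside an attribute test, but "starts with foo and has additional characters" can be written with two XPath 1.0 functions: //a[starts-with(@id, 'foo') and string-length(@id) > 3]. The sketch below applies the same two conditions in Python, since xml.etree.ElementTree does not implement those functions. The body wrapper and the extra bar element are assumptions added for illustration.

```python
import xml.etree.ElementTree as ET

# The three anchors from the question plus a hypothetical non-matching one.
page = ET.fromstring(
    '<body><a id="foo"/><a id="foo1"/><a id="foo2"/><a id="bar"/></body>'
)

prefix = "foo"
# Same two conditions as:
#   //a[starts-with(@id, 'foo') and string-length(@id) > 3]
matches = [
    a.get("id")
    for a in page.iter("a")
    if a.get("id", "").startswith(prefix) and len(a.get("id")) > len(prefix)
]
```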

double slash for xpath. Selenium Java Webdriver

Question: I am using Selenium WebDriver. I have a doubt about XPath. If I have the following code example: <div> <div> <div> <a> <div> </div> </a> </div> </div> </div> And I want to locate the element which is in the last <div>. I think I have 2 options with XPath. The first option is with a single slash: driver.findElement(By.xpath("/div/div/div/a/div")).click(); The second option is using a double slash (and here is where I have the doubt): driver.findElement(By.xpath("//a/div")).click(); Is it going to
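
The difference between the two forms can be seen on a small document. In this sketch with xml.etree.ElementTree, the fixed path must match every step from the root, while the double-slash form finds an a/div pair at any depth. The second a element is a hypothetical addition so that the two queries return different results; ElementTree paths are relative to the root element, hence the leading dot.

```python
import xml.etree.ElementTree as ET

# The markup from the question, plus a hypothetical extra <a><div/></a>
# at a different depth to make the difference visible.
outer = ET.fromstring(
    "<div>"
    "<div><div><a><div>target</div></a></div></div>"
    "<a><div>other</div></a>"
    "</div>"
)

# Fixed path from the root <div>: every step must match exactly.
fixed = outer.findall("./div/div/a/div")

# Descendant search: any <a> at any depth, then its <div> children.
anywhere = outer.findall(".//a/div")

fixed_text = [d.text for d in fixed]
anywhere_text = [d.text for d in anywhere]
```

A call that returns a single element, such as findElement, takes the first match in document order, so the double-slash form only picks the intended node when no earlier a/div pair exists on the page.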

Scrapy - select xpath with a regular expression

Question: Part of the HTML that I am scraping looks like this: <h2> <span class="headline" id="Profile">Profile</span></h2> <ul><li> <b>Name</b> Albert Einstein </li><li> <b>Birth Name:</b> Alberto Ein </li><li> <b>Birthdate:</b> December 24, 1986 </li><li> <b>Birthplace:</b> <a href="/Ulm" title="Dest">Ulm</a>, Germany </li><li> <b>Height:</b> 178cm </li><li> <b>Blood Type:</b> A </li></ul> I want to extract each component: name, birth name, birthday, etc. To extract the name I do: a_name =
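
One possible approach, sketched here with xml.etree.ElementTree instead of Scrapy selectors and without a regular expression, is to treat each b element as a label and the remaining text of its li as the value. This only works because the fragment from the question happens to be well-formed XML; the profile dictionary is a hypothetical container.

```python
import xml.etree.ElementTree as ET

# The <ul> fragment from the question.
html = (
    "<ul><li> <b>Name</b> Albert Einstein </li>"
    "<li> <b>Birth Name:</b> Alberto Ein </li>"
    "<li> <b>Birthdate:</b> December 24, 1986 </li>"
    '<li> <b>Birthplace:</b> <a href="/Ulm" title="Dest">Ulm</a>, Germany </li>'
    "<li> <b>Height:</b> 178cm </li>"
    "<li> <b>Blood Type:</b> A </li></ul>"
)

profile = {}
for li in ET.fromstring(html).findall("li"):
    label = li.find("b")
    # All text inside the <li>, including the text of a nested <a>.
    full_text = "".join(li.itertext())
    # Drop the label once; what remains is the value.
    value = full_text.replace(label.text, "", 1).strip()
    profile[label.text.strip().rstrip(":")] = value
```

Collecting the text of the whole li keeps values that are split across a link and plain text, such as the birthplace, in one piece.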