domxpath

Get full XPath from partial

落爺英雄遲暮 posted on 2019-12-07 13:36:00
Question: I am using Selenium with Perl and have a label on a page. To access this label I use the following XPath: //*[text()='some here'] . The problem is that I need to get the full XPath of this element, like /html/body/table/tr/..../any other/and other/ . Is there any Selenium method or Perl function for this? I am looking for a Perl solution or anything else that works. Thanks. Answer 1: "looking for perl solution or any other working things" This XPath 2.0 expression: string-join(for $node in ancestor-or-self::node() return

PHP + Wikipedia: Get content from the first paragraph in a Wikipedia article?

て烟熏妆下的殇ゞ posted on 2019-12-06 09:25:50
I’m trying to use Wikipedia’s API (api.php) to get the content of a Wikipedia article provided by a link (like: http://en.wikipedia.org/wiki/Stackoverflow ). And what I want is to get the first paragraph (which in the example of the Stackoverflow wiki article is: Stack Overflow is a website part of the Stack Exchange network[2][3] featuring questions and answers on a wide range of topics in computer programming.[4][5][6] ). I’m going to do some data manipulation with it. I’ve tried with the API url: http://en.wikipedia.org/w/api.php?action=parse&page=Stackoverflow&format=xml but it gives me

Get full XPath from partial

时光怂恿深爱的人放手 posted on 2019-12-05 18:32:25
I am using Selenium with Perl and have a label on a page. To access this label I use the following XPath: //*[text()='some here'] . The problem is that I need to get the full XPath of this element, like /html/body/table/tr/..../any other/and other/ . Is there any Selenium method or Perl function for this? I am looking for a Perl solution or anything else that works. Thanks. "looking for perl solution or any other working things" This XPath 2.0 expression: string-join(for $node in ancestor-or-self::node() return concat(('@')[$node/self::attribute()], $node/name(), (concat('[', count($node/preceding-sibling::node() [name()=
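For readers working in PHP rather than Perl, a minimal sketch of the same idea is possible with DOMNode::getNodePath(), which returns an absolute location path for a node found through a partial expression. This is not the XPath 2.0 approach the answer above starts to describe; the sample markup and the helper name fullPaths() are assumptions made for illustration.

```php
<?php
// Hypothetical helper: resolve a partial XPath expression to absolute paths.
function fullPaths(string $html, string $expression): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect markup

    $xpath = new DOMXPath($doc);
    $paths = [];
    foreach ($xpath->query($expression) as $node) {
        // DOMNode::getNodePath() builds the absolute path of this node.
        $paths[] = $node->getNodePath();
    }
    return $paths;
}

// Sample markup is an assumption; the page from the question is not shown.
$html = '<html><body><table><tr><td><label>some here</label></td></tr></table></body></html>';
foreach (fullPaths($html, "//*[text()='some here']") as $path) {
    echo $path, "\n";
}
```

The returned path depends on how libxml builds the tree, so it may differ from what a browser's developer tools would report for the same page.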

Parse anchor tags which have img tag as child element

末鹿安然 posted on 2019-12-05 02:56:34
Question: I need to find all anchor tags that have an img tag as a child element. Consider the following cases: <a href="test1.php"> <img src="test1.jpg" alt="Test 1" /> </a> <a href="test2.php"> <span> <img src="test2.jpg" alt="Test 2" /> </span> </a> My requirement is to generate a list of href attributes along with src and alt, i.e. $output = array( array( 'href' => 'test1.php', 'src' => 'test1.jpg', 'alt' => 'Test 1' ), array( 'href' => 'test2.php', 'src' => 'test2.jpg', 'alt' => 'Test 2' ) ); How

PHP change DOM useragent

坚强是说给别人听的谎言 posted on 2019-12-04 20:52:22
I have this simple code to get the title of any page: <?php $doc = new DOMDocument(); @$doc->loadHTMLFile('http://www.facebook.com'); $xpath = new DOMXPath($doc); echo $xpath->query('//title')->item(0)->nodeValue."\n"; ?> It works fine on all pages that I have tried, but not on Facebook. When I try it on Facebook, it does not show "Welcome to Facebook - Log In, Sign Up or Learn More" but "Update Your Browser | Facebook". I think there is a problem with the user agent. So is there a way to change the user agent, or is there any other solution for this? You can set the user agent in php
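One possible way to send a custom user agent, sketched below, is to build a stream context and hand it to libxml with libxml_set_streams_context(), so that DOMDocument::loadHTMLFile() uses it for its request. This is not necessarily the method the truncated answer goes on to describe. The helper name makeUserAgentContext() and the user agent string are assumptions, and the network call is left commented out.

```php
<?php
// Hypothetical helper: build a stream context carrying a User-Agent header.
function makeUserAgentContext(string $userAgent)
{
    return stream_context_create([
        'http' => ['user_agent' => $userAgent],
    ]);
}

// The user agent string is an arbitrary example.
$context = makeUserAgentContext('Mozilla/5.0 (compatible; ExampleBot/1.0)');

// Make libxml use this context for the streams it opens.
libxml_set_streams_context($context);

// Network request from the question, commented out in this sketch:
// $doc = new DOMDocument();
// @$doc->loadHTMLFile('http://www.facebook.com');
// $xpath = new DOMXPath($doc);
// echo $xpath->query('//title')->item(0)->nodeValue, "\n";
```

Whether a given site then serves its regular page depends on the site, so the result for Facebook is not guaranteed.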

Parse anchor tags which have img tag as child element

爷,独闯天下 posted on 2019-12-03 20:15:43
I need to find all anchor tags that have an img tag as a child element. Consider the following cases: <a href="test1.php"> <img src="test1.jpg" alt="Test 1" /> </a> <a href="test2.php"> <span> <img src="test2.jpg" alt="Test 2" /> </span> </a> My requirement is to generate a list of href attributes along with src and alt, i.e. $output = array( array( 'href' => 'test1.php', 'src' => 'test1.jpg', 'alt' => 'Test 1' ), array( 'href' => 'test2.php', 'src' => 'test2.jpg', 'alt' => 'Test 2' ) ); How can I match the above cases in PHP (using DOMXPath or any other DOM parser)? Thanks in advance! Assuming
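A minimal sketch of one way to match both cases with DOMXPath follows. The predicate in //a[.//img] selects anchors that contain an img at any depth, which covers the nested span case as well as the direct child. The function name collectImageLinks() is an assumption, and this is not necessarily the approach the truncated answer takes.

```php
<?php
// Hypothetical helper: list href/src/alt for every anchor containing an img.
function collectImageLinks(string $html): array
{
    $doc = new DOMDocument();
    @$doc->loadHTML($html); // suppress warnings from imperfect markup
    $xpath = new DOMXPath($doc);

    $output = [];
    // Anchors that have an img somewhere below them.
    foreach ($xpath->query('//a[.//img]') as $a) {
        // First img descendant, evaluated relative to this anchor.
        $img = $xpath->query('.//img', $a)->item(0);
        $output[] = [
            'href' => $a->getAttribute('href'),
            'src'  => $img->getAttribute('src'),
            'alt'  => $img->getAttribute('alt'),
        ];
    }
    return $output;
}

// The two cases from the question.
$html = '<a href="test1.php"> <img src="test1.jpg" alt="Test 1" /> </a>'
      . '<a href="test2.php"> <span> <img src="test2.jpg" alt="Test 2" /> </span> </a>';
print_r(collectImageLinks($html));
```

If an anchor holds several images, this sketch reports only the first one; looping over the inner query result would list them all.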

Xpath php fetch links

我的梦境 posted on 2019-12-02 04:10:52
Question: I'm using this example to fetch links from a website: http://www.merchantos.com/makebeta/php/scraping-links-with-php/ $xpath = new DOMXPath($dom); $hrefs = $xpath->evaluate("/html/body//a"); for ($i = 0; $i < $hrefs->length; $i++) { $href = $hrefs->item($i); var_dump($href); $url = $href->getAttribute('href'); echo "<br />Link stored: $url"; } It works well and gets all the links, but I cannot get the actual 'title' of the link. For example, if I have: <a href="www.google.com">Google</a> I

Xpath php fetch links

久未见 posted on 2019-12-02 00:06:44
I'm using this example to fetch links from a website: http://www.merchantos.com/makebeta/php/scraping-links-with-php/ $xpath = new DOMXPath($dom); $hrefs = $xpath->evaluate("/html/body//a"); for ($i = 0; $i < $hrefs->length; $i++) { $href = $hrefs->item($i); var_dump($href); $url = $href->getAttribute('href'); echo "<br />Link stored: $url"; } It works well and gets all the links, but I cannot get the actual 'title' of the link. For example, if I have: <a href="www.google.com">Google</a> I want to be able to fetch the 'Google' term too. I'm a little lost and quite new to XPath. Try this: $link
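As a hedged sketch (not the continuation of the truncated answer above), the visible text of each anchor can be read from the node's textContent property alongside its href attribute. The sample markup, the function name extractLinks(), and the array keys are assumptions.

```php
<?php
// Hypothetical helper: return the URL and visible text of every link.
function extractLinks(string $html): array
{
    $dom = new DOMDocument();
    @$dom->loadHTML($html); // suppress warnings from imperfect markup
    $xpath = new DOMXPath($dom);

    // Same expression as in the question.
    $hrefs = $xpath->evaluate("/html/body//a");

    $links = [];
    for ($i = 0; $i < $hrefs->length; $i++) {
        $href = $hrefs->item($i);
        $links[] = [
            'url'  => $href->getAttribute('href'),
            // textContent is the text of the anchor and its descendants.
            'text' => trim($href->textContent),
        ];
    }
    return $links;
}

$links = extractLinks('<html><body><a href="www.google.com">Google</a></body></html>');
foreach ($links as $link) {
    echo "Link stored: {$link['url']} ({$link['text']})\n";
}
```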

PHP xpath query on XML with default namespace binding

北城以北 posted on 2019-11-30 21:45:21
I have one solution to the subject problem, but it’s a hack and I’m wondering if there’s a better way to do this. Below is a sample XML file and a PHP CLI script that executes an xpath query given as an argument. For this test case, the command line is: ./xpeg "//MainType[@ID=123]" What seems most strange is this line, without which my approach doesn’t work: $result->loadXML($result->saveXML($result)); As far as I know, this simply re-parses the modified XML, and it seems to me that this shouldn’t be necessary. Is there a better way to perform xpath queries on this XML in PHP? XML ( note the
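A common alternative to re-parsing the document is to bind a prefix to the default namespace with DOMXPath::registerNamespace() and use that prefix in the query. The sketch below assumes a made-up XML sample, the namespace URI http://example.com/ns, the prefix x, and the function name queryDefaultNamespace(), since the original XML is not shown here.

```php
<?php
// Hypothetical helper: query elements that live in a default namespace.
function queryDefaultNamespace(string $xml, string $namespaceUri, string $expression): DOMNodeList
{
    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $xpath = new DOMXPath($doc);
    // XPath 1.0 has no default namespace, so bind an explicit prefix.
    $xpath->registerNamespace('x', $namespaceUri);
    return $xpath->query($expression);
}

// Made-up sample; the namespace URI is an assumption.
$xml = '<Root xmlns="http://example.com/ns">'
     . '<MainType ID="123"><Name>first</Name></MainType>'
     . '<MainType ID="456"><Name>second</Name></MainType>'
     . '</Root>';

// The query from the question, with the prefix added to the element name.
$result = queryDefaultNamespace($xml, 'http://example.com/ns', '//x:MainType[@ID=123]');
foreach ($result as $node) {
    echo $node->nodeName, ' ID=', $node->getAttribute('ID'), "\n";
}
```

The cost of this design is that queries given on the command line would need the prefix on every element name, which may be why the question looks for a different workaround.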

PHP's DOMXPath is stripping out my tags inside the matched text

情到浓时终转凉″ posted on 2019-11-30 17:57:36
Question: I asked this question yesterday, and at the time it was just what I needed, but while working with some live data I discovered that it wasn't quite doing what I expected. Parse HTML with PHP's HTML DOMDocument It gets the data from the HTML page, but then it also strips out all the HTML tags inside the captured block of text, which isn't what I want. (I might want to take some of the tags out, but not all, and this can be done later.) Answer 1: That's a common problem with DOM : you have to do a
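One way to keep the inner markup, sketched here under assumptions, is to serialize each child node with DOMDocument::saveHTML() instead of reading nodeValue, which returns text only. The answer above is cut off, so this may not be the step it goes on to describe. The helper name innerHtml() and the sample markup are hypothetical, and passing a node to saveHTML() requires PHP 5.3.6 or later.

```php
<?php
// Hypothetical helper: serialize the children of a node, keeping their tags.
function innerHtml(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // saveHTML() with a node argument serializes just that node.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

$doc = new DOMDocument();
@$doc->loadHTML('<div id="content">Some <b>bold</b> and <i>italic</i> text</div>');
$xpath = new DOMXPath($doc);
$div = $xpath->query('//div[@id="content"]')->item(0);

echo $div->nodeValue, "\n";  // text only, inner tags are dropped
echo innerHtml($div), "\n";  // inner tags are preserved
```

Unwanted tags can then be removed from the returned string or from the node tree in a later step.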