I'm having some issues with XPath and import.io and I hope you'll be able to help me. :)
The HTML code:
<a href="page.php?var=12345">
So far, I have managed to extract the content of the href (page.php?var=12345) with this:
./td[3]/a[1]/@href
However, I would like to collect only: 12345
substring might be the solution, but it does not seem to work on import.io the way I am using it:
substring(./td[3]/a[1]/@href,13)
Any idea what the problem is?
Thanks a lot in advance!
Try using this for the XPath (with the field type set to Text):
.//*[@class='oeil']/a/@href
Then use this for your regex:
([^=]*)$
This will get you the ISBN number you are looking for.
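To see what that regex captures, here is a minimal Python sketch using the standard re module. It only illustrates the pattern itself; how import.io applies the regex internally is not shown in this thread, and the helper name after_last_equals is made up for the example. Note that the pattern keeps everything after the last "=", so it only matches the intended value when the wanted parameter comes last in the URL.

```python
import re

# Same pattern as above: capture the run of non-"=" characters at the end.
LAST_SEGMENT = re.compile(r"([^=]*)$")

def after_last_equals(href: str) -> str:
    """Return the text that follows the last '=' in href (hypothetical helper)."""
    match = LAST_SEGMENT.search(href)
    return match.group(1) if match else ""

print(after_last_equals("page.php?var=12345"))
```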
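import.io only supports functions in XPath when they return a node list. substring() returns a string, so it falls outside that.
Your path expression is fine, but the start position should probably be 14, since XPath counts characters from 1 and the = is the 13th character: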
substring(./td[3]/a[1]/@href,14)
"Does not seem to work" is not a very clear description of what is wrong. Do you get error messages? Is the output wrong? Do you have any code surrounding the path expression you could show?
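On the 13 versus 14 offset: XPath string positions start at 1, while most programming languages start at 0. A small Python sketch makes the difference visible. The function xpath_substring is a hypothetical stand-in that only mimics the two-argument form of substring() for whole-number start positions of 1 or more; it is not an XPath engine.

```python
def xpath_substring(text: str, start: int) -> str:
    """Mimic XPath 1.0 substring(text, start) for integer start >= 1."""
    # XPath is 1-based, Python slicing is 0-based, hence the "- 1".
    return text[start - 1:]

href = "page.php?var=12345"
print(xpath_substring(href, 13))  # still includes the "="
print(xpath_substring(href, 14))  # digits only
```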
You can use substring(), but using substring-after() would be even better:
substring-after(/a/@href,'=')
Assuming the tiny snippet you have shown as input:
<a href="page.php?var=12345"/>
it returns
12345
and, taking into account the structure of your input:
substring-after(./td[3]/a[1]/@href,'=')
With the leading ./ this path expression selects only td elements that are immediate children of the current context node. I trust you know what you are doing.
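If the tool will not evaluate substring-after() for you, the same result can be reproduced outside it. The sketch below uses Python's standard xml.etree.ElementTree, which supports the simple ./td[3]/a[1] path but not XPath string functions, so the "after the first =" step is done with str.partition. The surrounding table row is invented for the example; only the a element comes from the question.

```python
import xml.etree.ElementTree as ET

# Hypothetical row: only the <a href="..."> is taken from the question.
ROW = (
    "<tr><td>a</td><td>b</td>"
    '<td><a href="page.php?var=12345">link</a></td></tr>'
)

def value_after_equals(row_xml: str) -> str:
    """Return the part of the third cell's first link href after the first '='."""
    row = ET.fromstring(row_xml)
    # The leading "./" limits the search to td children of the row itself.
    link = row.find("./td[3]/a[1]")
    if link is None:
        return ""
    href = link.get("href", "")
    # Like substring-after, partition yields "" when "=" is missing.
    return href.partition("=")[2]

print(value_after_equals(ROW))
```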
Source: https://stackoverflow.com/questions/29636372/how-to-use-substring-with-import-io