How to use substring() with Import.io?

I'm having some issues with XPath and import.io and I hope you'll be able to help me. :)

The html code:

<a href="page.php?var=12345">

For the moment, I manage to extract the content of the href ( page.php?var=12345 ) with this:

./td[3]/a[1]/@href

However, I would like to collect only: 12345

substring might be the solution but it does not seem to work on import.io as I use it...

substring(./td[3]/a[1]/@href,13)

Any ideas of what the problem is?

Thanks a lot in advance!

Try using this for the XPath (with the field type set to Text):

.//*[@class='oeil']/a/@href

Then use this for your regex:

([^=]*)$

This captures everything after the last = in the href, which is the ISBN you are looking for.
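To check the pattern outside import.io, here is a minimal Python sketch. It assumes the href value shown in the question and uses Python's re module, so import.io's own regex engine may behave slightly differently.

```python
import re

# Sample href taken from the question; the real page may differ.
href = "page.php?var=12345"

# ([^=]*)$ captures the longest run of non-'=' characters at the end of the string.
match = re.search(r"([^=]*)$", href)
value = match.group(1)
print(value)  # 12345
```

Note that if the href contains no = at all, this pattern captures the whole string.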

import.io only supports XPath functions when they return a node list. substring() returns a string, which is why it does not work there.

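Your path expression is fine, but the start position is off by one. XPath counts characters from 1, and the = is the 13th character of page.php?var=12345, so starting at 13 returns =12345. It should perhaps be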

substring(./td[3]/a[1]/@href,14)
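The difference between 13 and 14 can be sketched in Python. The helper xpath_substring below is hypothetical: it only mimics the two-argument form of XPath 1.0 substring() for integer start positions of 1 or more, and it is not part of any library.

```python
def xpath_substring(text: str, start: int) -> str:
    """Mimic XPath 1.0 substring(text, start) for an integer start >= 1."""
    # XPath positions are 1-based, Python slices are 0-based.
    return text[start - 1:]


href = "page.php?var=12345"  # sample value from the question

from_13 = xpath_substring(href, 13)
from_14 = xpath_substring(href, 14)
print(from_13)  # =12345
print(from_14)  # 12345
```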

"Does not seem to work" is not a very clear description of what is wrong. Do you get error messages? Is the output wrong? Do you have any code surrounding the path expression you could show?

You can use substring, but using substring-after() would be even better.

substring-after(/a/@href,'=')

assuming as input the tiny snippet you have shown:

<a href="page.php?var=12345"/>

will select

12345

and taking into account the structure of your input

substring-after(./td[3]/a[1]/@href,'=')
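For comparison, this is roughly what substring-after() does to the string, sketched in Python. The helper name xpath_substring_after is made up for illustration, and it assumes a non-empty marker.

```python
def xpath_substring_after(text: str, marker: str) -> str:
    """Mimic XPath 1.0 substring-after(text, marker) for a non-empty marker."""
    # partition() splits on the first occurrence of marker; the tail is
    # the empty string when marker is absent, matching XPath's behaviour.
    return text.partition(marker)[2]


href = "page.php?var=12345"  # sample value from the question

after_equals = xpath_substring_after(href, "=")
missing = xpath_substring_after(href, "#")
print(after_equals)  # 12345
```

Unlike substring() with a fixed offset, this keeps working if the part before the = changes length.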

Note that the leading ./ makes the path relative to the current context node, so ./td[3] matches only td elements that are its immediate children. I trust you know what you are doing.
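The child versus descendant distinction can be tried out with Python's xml.etree.ElementTree, which supports a limited XPath subset. The table row below is a made-up example, not the asker's real page.

```python
import xml.etree.ElementTree as ET

# Hypothetical row: the second cell holds a nested table with its own td.
row = ET.fromstring(
    "<tr>"
    "<td>first</td>"
    "<td><table><tr><td>nested</td></tr></table></td>"
    "<td><a href='page.php?var=12345'>link</a></td>"
    "</tr>"
)

child_cells = row.findall("./td")   # only td elements directly under row
all_cells = row.findall(".//td")    # td elements at any depth under row
href = row.find("./td[3]/a[1]").get("href")

print(len(child_cells), len(all_cells))  # 3 4
print(href)  # page.php?var=12345
```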

Source: https://stackoverflow.com/questions/29636372/how-to-use-substring-with-import-io

Tags

xpath

import.io