How to use substring() with Import.io?

守給你的承諾、 submitted on 2019-12-22 06:56:21

Question


I'm having some issues with XPath and import.io and I hope you'll be able to help me. :)

The HTML code:

<a href="page.php?var=12345">

So far, I can extract the content of the href (page.php?var=12345) with this:

./td[3]/a[1]/@href

However, I would like to collect just: 12345

substring() might be the solution, but it does not seem to work on import.io the way I am using it:

substring(./td[3]/a[1]/@href,13)

Any idea what the problem is?

Thanks a lot in advance!


Answer 1:


Try using this for the XPath (with the field type set to Text):

.//*[@class='oeil']/a/@href

Then use this for your regex:

([^=]*)$

This will get you the ISBN number you are looking for.
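The regex can be checked outside import.io. The following is a minimal sketch in Python, assuming import.io's regex engine treats this pattern the same way Python's re module does; the helper name value_after_last_equals is hypothetical and only used for illustration.

```python
import re

# Same pattern as above: capture the run of non-"=" characters
# that ends at the end of the string.
PATTERN = re.compile(r"([^=]*)$")

def value_after_last_equals(href: str) -> str:
    """Return whatever follows the last '=' in href (hypothetical helper)."""
    match = PATTERN.search(href)
    # The pattern can match an empty string, so search() always finds something.
    return match.group(1)

print(value_after_last_equals("page.php?var=12345"))  # prints 12345
```

Because the pattern is anchored at the end of the string, it keeps working if the text before the = sign changes length.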

import.io only supports functions in XPath when they return a node list.




Answer 2:


Your path expression is fine, but XPath string positions start at 1, so position 13 is the = sign and the result is =12345. Perhaps it should be

substring(./td[3]/a[1]/@href,14)
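The difference between 13 and 14 can be checked by hand. The sketch below emulates the two-argument form of XPath 1.0 substring() in plain Python, for integer start positions only; xpath_substring is a hypothetical helper written for this illustration, not a function from any library.

```python
def xpath_substring(value: str, start: int) -> str:
    """Emulate XPath 1.0 substring(value, start) for integer start values.

    XPath counts characters from 1, while Python slices count from 0,
    so the start position is shifted by one. A start below 1 returns
    the whole string, as in XPath.
    """
    return value[max(start - 1, 0):]

href = "page.php?var=12345"

print(xpath_substring(href, 13))  # prints =12345
print(xpath_substring(href, 14))  # prints 12345
```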

"Does not seem to work" is not a very clear description of what is wrong. Do you get error messages? Is the output wrong? Do you have any code surrounding the path expression you could show?


You can use substring(), but substring-after() would be even better, because it does not depend on the length of the text before the = sign.

substring-after(/a/@href,'=')

assuming as input the tiny snippet you have shown:

<a href="page.php?var=12345"/>

will select

12345

and, taking into account the structure of your input:

substring-after(./td[3]/a[1]/@href,'=')
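To illustrate what substring-after() returns, here is a rough Python equivalent of its XPath 1.0 behavior. The helper name xpath_substring_after is hypothetical; it only mirrors the function's semantics and says nothing about whether import.io accepts the expression.

```python
def xpath_substring_after(value: str, separator: str) -> str:
    """Emulate XPath 1.0 substring-after(value, separator).

    Returns the text after the first occurrence of separator,
    or an empty string when separator does not occur in value.
    """
    if separator == "":
        # XPath returns the whole string for an empty separator.
        return value
    _head, _sep, tail = value.partition(separator)
    return tail

print(xpath_substring_after("page.php?var=12345", "="))  # prints 12345
```

Note that substring-after() splits at the first = sign, so a URL with several query parameters would return everything after the first one.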

The leading . makes the path relative to the current context node, so ./td[3] selects only td elements that are immediate children of that node. I trust you know what you are doing.



Source: https://stackoverflow.com/questions/29636372/how-to-use-substring-with-import-io
