Google spreadsheet importxml : how to grab all names of element nodes in XML

Posted by 谁都会走 on 2020-03-06 03:08:02

Question


I'm trying to use the importxml function to import XML like this:

<item>
    <name>James</name>
    <date>11/11/2016</date>
    <description>Student</description>
</item>

If I use

=importxml(URL, "//item")

I can import the values, but not the element names that go with them.

I'd like to pull something like this:

name      date       description
James     11/11/2016 Student

Is there an XPath function that does this?


Answer 1:


You can get the headers with this formula (A1 holds the URL):

=unique(arrayformula(regexreplace(transpose(split(IMPORTDATA(A1),"><",false)),">.*|\/","")))

Basically, I use importdata to pull everything on the page, then use split to break the text at every >< boundary between adjacent tags, and transpose to flip the result into a vertical list.

At that point each cell holds one fragment: an element name, followed by > and whatever data comes after it.

Then, using regexreplace with arrayformula, I strip everything after the element name (and any slashes) with the pattern ">.*|\/", and use unique to get a final list of all the distinct headers.
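To make the three steps easier to follow outside the spreadsheet, here is a minimal Python sketch of the same idea. It is an illustration, not the formula itself: the function name element_names and the sample string are made up for this example, the sample is the question's XML collapsed onto one line (the >< split only works when tags are directly adjacent), and stripping a leading < is an extra step that the formula above does not perform.

```python
import re


def element_names(xml_text: str) -> list[str]:
    """Mimic split -> regexreplace -> unique from the spreadsheet formula."""
    # Step 1 (split): break the text at every "><" boundary between tags.
    fragments = xml_text.split("><")

    # Step 2 (regexreplace): drop ">" plus everything after it, and any "/".
    cleaned = [re.sub(r">.*|/", "", fragment) for fragment in fragments]

    # Extra step, not in the formula: the very first fragment still starts
    # with "<", so strip it to get a bare element name.
    cleaned = [name.lstrip("<") for name in cleaned]

    # Step 3 (unique): remove duplicates while keeping first-seen order.
    return list(dict.fromkeys(name for name in cleaned if name))


# Hypothetical one-line version of the XML from the question.
sample = (
    "<item><name>James</name><date>11/11/2016</date>"
    "<description>Student</description></item>"
)

print(element_names(sample))
```

Note that the container element (item) shows up in the list alongside name, date and description, because the approach treats every tag the same way. In the sheet you would filter it out separately if you only want the column headers.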



Source: https://stackoverflow.com/questions/40562451/google-spreadsheet-importxml-how-to-grab-all-names-of-element-nodes-in-xml
