Extract the second attribute of an XML node in R (XML package)

Submitted by 只愿长相守 on 2019-12-14 00:25:32

Question


I want to extract both the 'lat' and 'lon' attributes from an .xml file like this:

<asdf>
<dataset>
    <px lon="-55.75" lat="-18.5">2.186213</px>
    <px lon="-50.0"  lat="-18.5">0.0</px>
    <px lon="-66.75" lat="-03.0">1.68412</px>
    </dataset>
</asdf>

This is what I've done so far, using the XML package in R:

#Load library for xml loading reading extracting
library(XML)

#Parse xml file 
a3  <- xmlRoot(xmlTreeParse("my_file.xml"))

#Extract text-value and attributes as lists
precip <- xmlSApply(a3, function(x) xmlSApply(x, xmlValue))
long   <- xmlSApply(a3, function(x) xmlSApply(x, xmlAttrs))
lat    <- xmlSApply(a3, function(x) xmlSApply(x, xmlAttrs)) #???

dt.lat.long.val <- data.frame(as.numeric(as.vector(lat)), 
                          as.numeric(as.vector(long)), 
                          as.numeric(as.vector(precip)))

How do I edit the line ending in #??? so that it returns the lat values?


Answer 1:


You can extract the data using something along these lines:

test <- '<asdf>
<dataset>
    <px lon="-55.75" lat="-18.5">2.186213</px>
    <px lon="-50.0"  lat="-18.5">0.0</px>
    <px lon="-66.75" lat="-03.0">1.68412</px>
    </dataset>
</asdf>'

library(XML)
a3 <- xmlParse(test)

out <- xpathApply(a3, "//px", function(x){
  coords <- xmlAttrs(x)
  data.frame(precip = xmlValue(x), lon = coords[1], lat = coords[2], stringsAsFactors = FALSE)
})

> do.call(rbind.data.frame, out)
       precip    lon   lat
lon  2.186213 -55.75 -18.5
lon1      0.0  -50.0 -18.5
lon2  1.68412 -66.75 -03.0
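
The answer above picks the attributes by position (coords[1], coords[2]), which depends on the order in which they appear in the file. As a possible alternative, not part of the original answer, the attributes can be looked up by name with xmlGetAttr. The sketch below assumes the same sample document; the names xml_text, doc and root are illustrative only. It also shows one way the line marked #??? in the question could be written.

```r
library(XML)

# Same sample document as in the question, inlined so the block runs on its own
xml_text <- '<asdf>
<dataset>
    <px lon="-55.75" lat="-18.5">2.186213</px>
    <px lon="-50.0"  lat="-18.5">0.0</px>
    <px lon="-66.75" lat="-03.0">1.68412</px>
</dataset>
</asdf>'

# Variant 1: XPath, looking each attribute up by name instead of by position
doc    <- xmlParse(xml_text, asText = TRUE)
lon    <- xpathSApply(doc, "//px", xmlGetAttr, "lon")
lat    <- xpathSApply(doc, "//px", xmlGetAttr, "lat")
precip <- xpathSApply(doc, "//px", xmlValue)

# Convert the character vectors to numbers while building the data frame
dt.lat.long.val <- data.frame(lat    = as.numeric(lat),
                              lon    = as.numeric(lon),
                              precip = as.numeric(precip))

# Variant 2: keep the nested xmlSApply structure from the question and
# pass the attribute name through to xmlGetAttr (the "#???" line)
root <- xmlRoot(xmlTreeParse(xml_text, asText = TRUE))
lat2 <- xmlSApply(root, function(x) xmlSApply(x, xmlGetAttr, "lat"))
lat2 <- as.numeric(as.vector(lat2))
```

Looking attributes up by name keeps the result correct even if a file lists lat before lon. Swapping "lat" for "lon" in the second variant should give the longitudes in the same way.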


Source: https://stackoverflow.com/questions/23567988/extract-second-attribute-of-a-xml-node-in-r-xml-package
