R dataframe from XML when values are multiple or missing

盖世英雄少女心 2020-12-16 08:40

This question is similar to a previous question, Import all fields (and subfields) of XML as dataframe, but I want to pull out only a subset of the XML data and want to incl

4 Answers
  •  没有蜡笔的小新
    2020-12-16 09:15

    Assuming the XML data is in a file called world.xml, read it in and iterate over the cities, extracting each city's name and the bname of any associated landmarks:

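    library(XML)
    doc <- xmlParse("world.xml", useInternalNodes = TRUE)

    # Build one small data frame per city, then stack them with rbind
    do.call(rbind, xpathApply(doc, "/world/city", function(node) {

       city <- xmlValue(node[["name"]])

       # bname of every building in this city whose type is 'landmark'
       xp <- "./buildings/building[./type/text()='landmark']/bname"
       landmark <- xpathSApply(node, xp, xmlValue)
       # An empty match may come back as NULL or as an empty list,
       # so test the length rather than is.null()
       if (length(landmark) == 0) landmark <- NA

       data.frame(city, landmark, stringsAsFactors = FALSE)

    }))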
    

    The result is:

          city     landmark
    1   London Tower Bridge
    2 New York         <NA>
    3    Paris Eiffel Tower
    4    Paris       Louvre
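
To try the approach without a world.xml file, here is a minimal self-contained sketch. The sample XML is hypothetical: its layout (world/city/name and buildings/building with type and bname children) is only inferred from the XPath expressions in the answer, and the non-landmark buildings are made up for illustration. It tests the match with length(), which covers both a NULL and an empty-list result.

```r
library(XML)

# Hypothetical sample data; the structure is an assumption inferred
# from the XPath expressions used in the answer above.
xml_text <- '<world>
  <city>
    <name>London</name>
    <buildings>
      <building><type>landmark</type><bname>Tower Bridge</bname></building>
      <building><type>station</type><bname>Waterloo</bname></building>
    </buildings>
  </city>
  <city>
    <name>New York</name>
    <buildings>
      <building><type>station</type><bname>Grand Central</bname></building>
    </buildings>
  </city>
  <city>
    <name>Paris</name>
    <buildings>
      <building><type>landmark</type><bname>Eiffel Tower</bname></building>
      <building><type>landmark</type><bname>Louvre</bname></building>
    </buildings>
  </city>
</world>'

# Parse from a string instead of a file
doc <- xmlParse(xml_text, asText = TRUE)

# One data frame per city: one row per landmark, or a single NA row
rows <- xpathApply(doc, "/world/city", function(node) {
  city <- xmlValue(node[["name"]])
  xp <- "./buildings/building[./type/text()='landmark']/bname"
  landmark <- xpathSApply(node, xp, xmlValue)
  # No landmark found: keep the city with a missing value
  if (length(landmark) == 0) landmark <- NA_character_
  data.frame(city, landmark, stringsAsFactors = FALSE)
})

result <- do.call(rbind, rows)
print(result)
```

Under these assumptions, a city with several landmarks contributes several rows and a city with none contributes a single row with a missing landmark, matching the shape of the result shown above.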
    
