This question is similar to a previous question, Import all fields (and subfields) of XML as dataframe, but I want to pull out only a subset of the XML data and want to incl
Assuming the XML data is in a file called world.xml, read it in and iterate over the cities, extracting the city name and the bname of any associated landmarks:
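library(XML)

# Parse the file into an internal document so XPath queries can be run on it
doc <- xmlParse("world.xml", useInternalNodes = TRUE)

# Build one small data frame per city, then stack them with rbind
do.call(rbind, xpathApply(doc, "/world/city", function(node) {
  city <- xmlValue(node[["name"]])
  # bname of every building under this city whose type is 'landmark'
  xp <- "./buildings/building[./type/text()='landmark']/bname"
  landmark <- xpathSApply(node, xp, xmlValue)
  # No match may come back as NULL or an empty list; length() covers both
  if (length(landmark) == 0) landmark <- NA
  data.frame(city, landmark, stringsAsFactors = FALSE)
}))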
The result is:
      city     landmark
1   London Tower Bridge
2 New York         <NA>
3    Paris Eiffel Tower
4    Paris       Louvre
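
Since world.xml itself is not shown here, the following is a minimal self-contained sketch for trying the approach without the file. The sample document is an assumption: its layout (world/city/name and buildings/building with type and bname children) is inferred only from the XPath expressions above, and the non-landmark buildings in it are made up for illustration. It also wraps the XPath result in unlist() and tests its length, on the assumption that an empty match may be returned as either NULL or an empty list.

```r
library(XML)

# Hypothetical stand-in for world.xml; the real file is not shown in the post.
xml_text <- '<world>
  <city>
    <name>London</name>
    <buildings>
      <building><type>landmark</type><bname>Tower Bridge</bname></building>
      <building><type>station</type><bname>Waterloo</bname></building>
    </buildings>
  </city>
  <city>
    <name>New York</name>
    <buildings>
      <building><type>station</type><bname>Grand Central</bname></building>
    </buildings>
  </city>
  <city>
    <name>Paris</name>
    <buildings>
      <building><type>landmark</type><bname>Eiffel Tower</bname></building>
      <building><type>landmark</type><bname>Louvre</bname></building>
    </buildings>
  </city>
</world>'

# asText = TRUE tells xmlParse the argument is XML content, not a file name
doc <- xmlParse(xml_text, asText = TRUE)

rows <- xpathApply(doc, "/world/city", function(node) {
  city <- xmlValue(node[["name"]])
  xp <- "./buildings/building[./type/text()='landmark']/bname"
  # unlist() turns an empty list into NULL and leaves a character vector as is
  landmark <- unlist(xpathSApply(node, xp, xmlValue))
  # Cities without a landmark still get a row, with a missing landmark
  if (length(landmark) == 0) landmark <- NA_character_
  data.frame(city, landmark, stringsAsFactors = FALSE)
})

# One row per (city, landmark) pair
result <- do.call(rbind, rows)
print(result)
```

A city with several landmarks contributes several rows because data.frame() recycles the single city value across the landmark vector, which is why Paris appears twice.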