This question is similar to a previous question, Import all fields (and subfields) of XML as dataframe, but I want to pull out only a subset of the XML data and want to incl
If you're looking to exactly reproduce the desired output you showed in your question, you can convert your XML to a list and then extract the information you want:
library(XML)  # provides xmlParse() and xmlToList()
xml_list <- xmlToList(xmlParse(xml_data))
First loop through each "building" node and remove those that contain "station":
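# Nested lapply: the function is applied to each child of each city
# (its name and its buildings), keeping only entries without "station"
xml_list <- lapply(xml_list, lapply, function(x) {
  x[!sapply(x, function(y) any(y == "station"))]
})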
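To see what the nested `lapply` call does, here is a minimal sketch on a hand-built list. The list `cities` and the station name are hypothetical; they only mimic the shape I am assuming `xmlToList()` returns for your data, so no XML parsing is involved.

```r
# Hypothetical list standing in for the assumed xmlToList() output
cities <- list(
  city = list(
    name = "London",
    buildings = list(
      building = list(type = "landmark", name = "Tower Bridge"),
      building = list(type = "station",  name = "Waterloo")
    )
  )
)

# lapply(X, lapply, FUN) hands FUN to the inner lapply, so FUN runs on
# every second-level element (here: name and buildings) of each city
filtered <- lapply(cities, lapply, function(x) {
  x[!sapply(x, function(y) any(y == "station"))]
})

str(filtered)
```

The city name passes through unchanged because it never equals "station", while any building with a field equal to "station" is dropped.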
Then collect the data for each city into a data frame:
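xml_list <- lapply(xml_list, function(x) {
  # Flatten the remaining buildings and drop the "landmark" type labels,
  # leaving only the landmark names
  bldgs <- unlist(x$buildings)
  bldgs <- bldgs[bldgs != "landmark"]
  # Cities with no landmarks left still get a row, with NA as the landmark
  if (is.null(bldgs)) bldgs <- NA
  data.frame(
    city = x$name,
    landmark = bldgs,
    stringsAsFactors = FALSE)
})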
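The `is.null()` check matters because `unlist()` on an empty list returns `NULL`, and `data.frame()` silently drops `NULL` columns. A small sketch with made-up values (the names `empty`, `no_guard` and `guarded` are just for illustration):

```r
# unlist() on an empty list gives NULL
empty <- unlist(list())

# Without the guard, the landmark column is silently dropped
no_guard <- data.frame(city = "New York", landmark = empty,
                       stringsAsFactors = FALSE)
names(no_guard)

# With the guard, the column is kept and holds NA
if (is.null(empty)) empty <- NA
guarded <- data.frame(city = "New York", landmark = empty,
                      stringsAsFactors = FALSE)
names(guarded)
```

Without the guard, the data frame for such a city would have one column fewer than the others, and the `rbind` step would fail.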
Then combine information from all cities together:
xml_output <- do.call("rbind", xml_list)
xml_output
           city     landmark
city     London Tower Bridge
city1  New York         <NA>
city.1    Paris Eiffel Tower
city.2    Paris       Louvre
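The row names (`city`, `city1`, `city.1`, ...) come from the names of the list elements passed to `rbind`. If you would rather have plain sequential row numbers, you can reset them. A minimal sketch, using hypothetical per-city data frames shaped like the ones built above:

```r
# Hypothetical per-city data frames in a named list
per_city <- list(
  city = data.frame(city = "London", landmark = "Tower Bridge",
                    stringsAsFactors = FALSE),
  city = data.frame(city = "Paris", landmark = c("Eiffel Tower", "Louvre"),
                    stringsAsFactors = FALSE)
)

combined <- do.call("rbind", per_city)

# Setting row names to NULL restores the default 1, 2, 3, ... numbering
rownames(combined) <- NULL
combined
```

The same `rownames(xml_output) <- NULL` line should work on the result above.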