For a homework assignment I am attempting to convert an XML file into a data frame in R. I have tried many different things, and I have searched for ideas on the internet bu
For me the canonical answer is
library(XML)
doc <- xmlParse("Olive_py.xml")
xmldf <- xmlToDataFrame(nodes = getNodeSet(doc, "//record"))
which is somewhat buried in @Parfait's answer.
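As a minimal sketch of that call, assuming a small hypothetical document whose `<record>` elements each have flat, uniquely named children (the XML text and field names below are made up for illustration and are not taken from Olive_py.xml):

```r
library(XML)

# Hypothetical sample document; the real file's structure may differ.
xml_text <- '<records>
  <record><id>1</id><name>apple</name></record>
  <record><id>2</id><name>pear</name></record>
</records>'

# Parse from a string instead of a file path.
doc <- xmlParse(xml_text, asText = TRUE)

# One row per <record>, one column per child element.
xmldf <- xmlToDataFrame(nodes = getNodeSet(doc, "//record"),
                        stringsAsFactors = FALSE)
print(xmldf)
```

All columns come back as character unless `colClasses` is supplied, so numeric fields usually need an explicit conversion afterwards.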
However, this will fail if some of the nodes have multiple child nodes of the same type. In such cases an extractor function will solve the problem:
example data
code
library(XML)
library(tidyr)
library(dplyr)

node2df <- function(node) {
  # (Optionally) read out properties of some optional child node
  outputNodes <- getNodeSet(node, "output")
  stdout <- if (length(outputNodes) > 0) xmlValue(outputNodes[[1]]) else NA

  # Turn a named vector into a two-column (name, value) tibble
  vec_as_df <- function(namedVec, row_name = "name", value_name = "value") {
    tibble(name = names(namedVec), value = unname(namedVec)) %>%
      setNames(c(row_name, value_name))
  }

  # Extract all node attributes into a single wide row
  node %>%
    xmlAttrs() %>%
    vec_as_df() %>%
    pivot_wider(names_from = name, values_from = value) %>%
    mutate(stdout = stdout)
}

# xmlFile is assumed to hold the path to the XML file
testResults <- xmlParse(xmlFile) %>%
  getNodeSet("/testrun/suite/test", fun = node2df) %>%
  bind_rows()
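
To try the extractor approach end to end, here is a self-contained sketch. The sample XML is hypothetical (the element names follow the XPath used above, but the attribute names `name` and `status` are assumptions), the extractor is a condensed variant that builds the row with `as_tibble()` rather than `pivot_wider()`, and it is applied with `lapply()` instead of the `fun` argument so that `bind_rows()` receives a plain list.

```r
library(XML)
library(dplyr)

# Hypothetical sample input; attribute names are assumed for illustration.
xml_text <- '<testrun>
  <suite>
    <test name="a" status="passed"><output>ok</output></test>
    <test name="b" status="failed"/>
  </suite>
</testrun>'

# Condensed extractor: one-row tibble built from the node's attributes,
# plus the text of the optional <output> child (NA when it is absent).
node2df <- function(node) {
  outputNodes <- getNodeSet(node, "output")
  out_text <- if (length(outputNodes) > 0) xmlValue(outputNodes[[1]]) else NA_character_
  as_tibble(as.list(xmlAttrs(node))) %>%
    mutate(stdout = out_text)
}

doc <- xmlParse(xml_text, asText = TRUE)
nodes <- getNodeSet(doc, "/testrun/suite/test")

# Apply the extractor to every matched node and stack the rows.
testResults <- bind_rows(lapply(nodes, node2df))
print(testResults)
```

Because each node is handled separately, a missing `<output>` child simply yields `NA` in that row instead of breaking the conversion.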