How to convert a portion of an XML file into a data frame? (properly)

后悔当初 2021-01-01 03:44

I am trying to extract information from an XML file from ClinicalTrials.gov. The file is organized in the following way:


  ...
  

        
2 answers
  •  失恋的感觉
    2021-01-01 04:17

    This answer converts each location node to a list with xmlToList, flattens that list with unlist, transposes the result into a one-row matrix, converts the row to a data.table, and then uses rbindlist to combine the rows for all locations into one table. The fill=TRUE argument matches columns by name and fills in NA wherever a location is missing an element.

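    library(XML)
    library(data.table)
    
    clinicalTrialUrl <- "http://clinicaltrials.gov/ct2/show/NCT01480479?resultsxml=true"
    # Parse into an internal (C-level) document so XPath queries can be run on it
    xmlDoc <- xmlParse(clinicalTrialUrl, useInternalNodes=TRUE)
    
    # Build one row per node matched by the XPath expression `path`
    xmlToDT <- function(doc, path) {
      rbindlist(
        lapply(getNodeSet(doc, path),
               # nested list -> named character vector -> one-row data.table
               function(x) data.table(t(unlist(xmlToList(x))))
        ), fill=TRUE)  # match columns by name; absent elements become NA
    }
    
    locationDT <- xmlToDT(xmlDoc, "//location")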
    locationDT[1:6]
    ##                                                                       facility.name facility.address.city facility.address.state facility.address.zip
    ## 1:                                                                "HYGEIA" Hospital               Marousi     District of Attica               151 23
    ## 2: Allina Health, Abbott Northwestern Hospital, John Nasseff Neuroscience Institute           Minneapolis              Minnesota                55407
    ## 3:                  Amrita Institute of Medical Sciences and Research Centre, Kochi                 Kochi                 Kerala              682 026
    ## 4:                                                      Anne Arundel Medical Center             Annapolis               Maryland                21401
    ## 5:                                                              Atlanta Cancer Care               Atlanta                Georgia                30005
    ## 6:                                                                    Austin Health            Heidelberg               Victoria                 3084
    ##    facility.address.country
    ## 1:                   Greece
    ## 2:            United States
    ## 3:                    India
    ## 4:            United States
    ## 5:            United States
    ## 6:                Australia
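
To see what fill=TRUE does without downloading anything, here is a minimal sketch on a made-up two-location document. The XML string, the hospital names, and the variable names (xmlText, doc, rows, locations) are assumptions for illustration only; the element names simply mirror the columns in the output above. The first location has no state element, so its state cell should come out as NA.

```r
library(XML)
library(data.table)

# Hypothetical miniature document (not real ClinicalTrials.gov data).
# The first <location> deliberately has no <state> element.
xmlText <- paste0(
  "<study>",
  "<location><facility><name>Hospital A</name>",
  "<address><city>Kochi</city><country>India</country></address>",
  "</facility></location>",
  "<location><facility><name>Hospital B</name>",
  "<address><city>Atlanta</city><state>Georgia</state>",
  "<country>United States</country></address>",
  "</facility></location>",
  "</study>"
)

# asText = TRUE tells xmlParse that the argument is XML content, not a file or URL
doc <- xmlParse(xmlText, asText = TRUE)

# One single-row data.table per <location>; the rows have different column sets
rows <- lapply(getNodeSet(doc, "//location"),
               function(x) data.table(t(unlist(xmlToList(x)))))

# fill = TRUE aligns the rows by column name and pads the gaps with NA
locations <- rbindlist(rows, fill = TRUE)
print(locations)
```

Without fill=TRUE, rbindlist would stop with an error here, because the two rows do not have the same number of columns.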
    
