Question:
Hello guys, I need to load an XML file into a data frame in R. The XML format is shown below. How do I achieve this?
<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
</posts>
I tried the code below, but it does not give the desired output. I am expecting tabular output, with the attribute names as column names and their values listed below them.
library(XML)
xml.url = "test.xml"
xmlfile = xmlTreeParse(xml.url)
class(xmlfile)
xmltop = xmlRoot(xmlfile)
print(xmltop)[1:2]
plantcat <- xmlSApply(xmltop, function(x) xmlSApply(x, xmlValue))
plantcat_df <- data.frame(t(plantcat))
Answer 1:
xml.text <- '<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row Id="1" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
  <row Id="2" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
  <row Id="3" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
  <row Id="4" PostTypeId="1" AcceptedAnswerId="17" CreationDate="2010-07-26T19:14:18.907" Score="6"/>
</posts>'

library(XML)
xml <- xmlParse(xml.text)
result <- as.data.frame(t(xmlSApply(xml["/posts/row"], xmlAttrs)),
                        stringsAsFactors = FALSE)
#   Id PostTypeId AcceptedAnswerId            CreationDate Score
# 1  1          1               17 2010-07-26T19:14:18.907     6
# 2  2          1               17 2010-07-26T19:14:18.907     6
# 3  3          1               17 2010-07-26T19:14:18.907     6
# 4  4          1               17 2010-07-26T19:14:18.907     6
This is a bit trickier than usual because the data is in attributes, not nodes (the nodes are empty), so unfortunately we can't use xmlToDataFrame(...).
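For contrast, here is a minimal sketch of the case where xmlToDataFrame(...) does apply. It assumes a hypothetical variant of the file in which each value is a child element of <row> rather than an attribute. The name node.text and the second row's values are made up for illustration and are not from the question.

```r
library(XML)

# Hypothetical variant of the input: values stored as child nodes,
# not as attributes (this layout is an assumption for illustration).
node.text <- '<?xml version="1.0" encoding="utf-8"?>
<posts>
  <row><Id>1</Id><Score>6</Score></row>
  <row><Id>2</Id><Score>7</Score></row>
</posts>'

doc <- xmlParse(node.text, asText = TRUE)

# One data frame row per <row> element, one column per child node name.
df <- xmlToDataFrame(doc, stringsAsFactors = FALSE)
print(df)
```

With the attribute-based file from the question there are no child nodes to read, which is why the answer goes through xmlAttrs instead.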
All the data above is still character, so you need to convert the columns to whatever class is appropriate.
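One way that conversion might look, as a sketch in base R. To keep it self-contained, a two-row stand-in for result is built by hand. Which columns are integers, and the choice of UTC for the timestamps, are assumptions, since the question does not say.

```r
# Stand-in for `result` from the answer: every column is character.
result <- data.frame(
  Id               = c("1", "2"),
  PostTypeId       = c("1", "1"),
  AcceptedAnswerId = c("17", "17"),
  CreationDate     = c("2010-07-26T19:14:18.907", "2010-07-26T19:14:18.907"),
  Score            = c("6", "6"),
  stringsAsFactors = FALSE
)

# Columns assumed to hold whole numbers.
int.cols <- c("Id", "PostTypeId", "AcceptedAnswerId", "Score")
result[int.cols] <- lapply(result[int.cols], as.integer)

# Parse the ISO 8601 timestamps; %OS reads fractional seconds on input.
# The time zone is an assumption (the XML does not state one).
result$CreationDate <- as.POSIXct(result$CreationDate,
                                  format = "%Y-%m-%dT%H:%M:%OS",
                                  tz = "UTC")

str(result)
```

If you would rather let R guess the classes, type.convert(result, as.is = TRUE) handles the numeric columns, but it leaves CreationDate as character, so the date column still needs an explicit conversion.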