How would I parse an XML file in R and carry out basic statistical analysis on the data?

南方客 2020-12-15 01:55

I am trying to parse an XML file in R so that I can analyze the data. I am trying to get the mean and standard deviation of the price. Also I would like to be able to get

2 Answers
  • 2020-12-15 02:15

    In

    z <- strptime ("HH:MM:SS.ms, "%H:%m:%S.%f")
    

    you are missing a closing ", so it is invalid syntax.

    Next, the data is non-standard: the usual convention puts a dot between seconds and sub-seconds, i.e. 12:23:34.567, to denote a timestamp. Milliseconds written that way can be parsed like this:

    > ts <- "12:00:00.050"
    > strptime(ts, "%H:%M:%OS")
    [1] "2010-07-09 12:00:00 CDT"
    > 
    

    So you not only need to get the data out of the XML first, but also need to convert the string. Alternatively, you can parse the string and fill a POSIXlt time structure 'by hand'.
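
    The string conversion could look like the following minimal sketch. It assumes the digits after the last colon are a decimal fraction of a second (so `:050` means 0.050 s), which fits the ordering of the sample data but is not confirmed by the question. The variable names are illustrative, and the second approach is a simpler by-hand variant that computes seconds since midnight rather than filling a POSIXlt structure.

    ```r
    # Sample timestamps in the question's format (colon before the sub-second part)
    raw <- c("12:00:00:01", "12:00:00:025", "12:00:00:050")

    # Approach 1: turn the last colon into a dot, then let %OS read the fraction
    # (assumption: the trailing digits are a decimal fraction of a second)
    fixed <- sub(":([0-9]+)$", ".\\1", raw)
    parsed <- strptime(fixed, "%H:%M:%OS", tz = "UTC")

    # Approach 2: split on colons and compute seconds since midnight by hand
    parts <- strsplit(raw, ":", fixed = TRUE)
    secs <- sapply(parts, function(p) {
      as.numeric(p[1]) * 3600 + as.numeric(p[2]) * 60 +
        as.numeric(paste0(p[3], ".", p[4]))
    })
    ```

    Since `strptime()` is given only a time of day, it fills in the current date, so the seconds-since-midnight vector may be the more convenient form for computing differences between rows.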

    Postscriptum: Forgot to mention that you need to enable printing of sub-second times:

    > options("digits.secs"=3)         # shows milliseconds (three digits)
    > strptime(ts, "%H:%M:%OS")
    [1] "2010-07-09 12:00:00.05 CDT"   # suppresses trailing zero
    > 
    

    Postscriptum 2: You are also in luck with respect to your file thanks to the XML package:

    > library(XML)
    > xmlToDataFrame("c:/Temp/foo.xml")     # save your data as c:/Temp/foo.xml
          timeStamp   Price
    1   12:00:00:01   25.02
    2   12:00:00:02      15
    3  12:00:00:025   15.02
    4  12:00:00:031   18.25
    5  12:00:00:039   18.54
    6  12:00:00:050   16.52
    7   12:00:01:01   17.50
    > 
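
    From a data frame like the one printed above, the mean and standard deviation asked about in the question might be obtained as sketched below. To keep the sketch self-contained, the data frame is rebuilt by hand from the printed values instead of being read from the file. It assumes `Price` comes back as text (character or factor), which is why it is converted first.

    ```r
    # Data as printed above, rebuilt by hand; Price is assumed to be text
    df <- data.frame(
      timeStamp = c("12:00:00:01", "12:00:00:02", "12:00:00:025", "12:00:00:031",
                    "12:00:00:039", "12:00:00:050", "12:00:01:01"),
      Price = c("25.02", "15", "15.02", "18.25", "18.54", "16.52", "17.50"),
      stringsAsFactors = FALSE
    )

    # as.character() first, so the conversion is also safe if Price is a factor
    price <- as.numeric(as.character(df$Price))

    price_mean <- mean(price)
    price_sd <- sd(price)   # sample standard deviation (n - 1 denominator)
    ```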
    
  • 2020-12-15 02:30

    For more complex XML data, it might be useful to use the XML package.

    library(XML)
    
    check <- xmlInternalTreeParse("/PathToXMLFile/checkXML.xml")
    xpathSApply(check, "//timeStamp", xmlValue)
    ## [1] " 12:00:00:01"  " 12:00:00:02"  " 12:00:00:025" " 12:00:00:031"
    ## [5] " 12:00:00:039" " 12:00:00:050" " 12:00:01:01" 
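
    The same XPath approach could be extended to the prices, roughly as follows. This is a sketch on an inline stand-in document: the `<data>` and `<record>` element names are hypothetical, since the question's file is not shown, and only `timeStamp` and `Price` are taken from the outputs in this thread.

    ```r
    library(XML)

    # Hypothetical stand-in for the file: the <data> and <record> element names
    # are assumptions; only <timeStamp> and <Price> appear in the outputs above
    txt <- "<data>
      <record><timeStamp> 12:00:00:01</timeStamp><Price>25.02</Price></record>
      <record><timeStamp> 12:00:00:02</timeStamp><Price>15</Price></record>
      <record><timeStamp> 12:00:00:025</timeStamp><Price>15.02</Price></record>
    </data>"

    doc <- xmlParse(txt, asText = TRUE)

    # xmlValue returns text, so strip stray whitespace and convert to numeric
    stamps <- trimws(xpathSApply(doc, "//timeStamp", xmlValue))
    prices <- as.numeric(trimws(xpathSApply(doc, "//Price", xmlValue)))

    mean(prices)
    sd(prices)
    ```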
    