In R XML Package, what is the difference between xmlParse and xmlTreeParse?

情深已故 2020-12-28 19:44

When would I want to use the xmlParse function versus the xmlTreeParse function? Also, when are parameter values useInternalNodes=TRUE

1 answer
  • 2020-12-28 20:21

    Here is some feedback after using the XML package.

    • xmlParse is a version of xmlTreeParse where the useInternalNodes argument is set to TRUE.
    • If you want an R object, use xmlTreeParse. This can be inefficient, and it is unnecessary if you only want to extract part of the XML document.
    • If you don't need an R object, just a C pointer, use xmlParse. You will need to know some XPath basics to work with the result.
    • Use asText=TRUE if your input is a text string rather than a file or a URL.
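
    The first and last points can be checked with a minimal sketch. The file tmp is a hypothetical temporary file created only for this illustration; the comparison looks at the class of the results, not their contents.

```r
library(XML)

txt <- "<doc><el> aa </el></doc>"

# xmlTreeParse with useInternalNodes = TRUE should return the same kind of
# object as xmlParse: a reference to a C-level document.
a <- xmlParse(txt, asText = TRUE)
b <- xmlTreeParse(txt, asText = TRUE, useInternalNodes = TRUE)
same_class <- identical(class(a), class(b))

# Here the input is a file name, so asText is left at its default.
# tmp is a hypothetical temporary file used only for this sketch.
tmp <- tempfile(fileext = ".xml")
writeLines(txt, tmp)
from_file <- xmlParse(tmp)
unlink(tmp)
```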

    Here is an example showing the difference between the two functions:

    txt <- "<doc>
              <el> aa </el>
           </doc>"
    library(XML)
    res <- xmlParse(txt,asText=TRUE)
    res.tree <- xmlTreeParse(txt,asText=TRUE)
    

    Now inspecting the 2 objects:

    > class(res)
    [1] "XMLInternalDocument" "XMLAbstractDocument"
    > class(res.tree)
    [1] "XMLDocument"         "XMLAbstractDocument"
    

    As you can see, res is an internal document: a pointer to a C object. res.tree is an R object. You can access its contents like this:

    > res.tree$doc$children
    $doc
    <doc>
     <el>aa</el>
    </doc>
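
    Because res.tree is an ordinary R object, it can also be walked with the package's accessor functions instead of `$`. The following is a rough sketch of that style; the variable names are only illustrative.

```r
library(XML)

txt <- "<doc><el> aa </el></doc>"
res.tree <- xmlTreeParse(txt, asText = TRUE)

root <- xmlRoot(res.tree)                 # top-level <doc> node as an R object
root_name <- xmlName(root)                # element name of the root
n_children <- xmlSize(root)               # number of child nodes of <doc>
child_names <- names(xmlChildren(root))   # names of the child nodes
el_node <- root[["el"]]                   # child node selected by name
```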
    

    For res, you need a valid XPath expression and one of these functions (xpathApply, xpathSApply, getNodeSet) to inspect it. For example:

    xpathApply(res,'//el')
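
    As a sketch of how the three functions differ in what they return, assuming a slightly larger document with two el elements (the second one is added here purely for illustration):

```r
library(XML)

# The second <el> is an assumption added for this sketch.
txt <- "<doc><el> aa </el><el> bb </el></doc>"
res <- xmlParse(txt, asText = TRUE)

nodes   <- getNodeSet(res, "//el")              # the matching internal nodes
as_list <- xpathApply(res, "//el", xmlValue)    # a list, one result per match
as_vec  <- xpathSApply(res, "//el", xmlValue)   # simplified to a character vector
```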
    

    Once you have a valid XML node, you can apply xmlValue, xmlGetAttr, etc. to extract node information. So these two statements are equivalent:

    ## we already have an R object, so just apply xmlValue to the right child
    xmlValue(res.tree$doc$children$doc)
    ## xpathSApply finds the matching nodes and passes each one to xmlValue
    xpathSApply(res,'//el',xmlValue)
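
    The same pattern should work for attributes with xmlGetAttr. The sketch below assumes hypothetical id attributes, which the original example does not have, and reads them through both kinds of object.

```r
library(XML)

# The id attributes are hypothetical; they are added only for this sketch.
txt <- "<doc><el id='1'> aa </el><el id='2'> bb </el></doc>"
res <- xmlParse(txt, asText = TRUE)
res.tree <- xmlTreeParse(txt, asText = TRUE)

# Internal document: select the nodes with XPath, then read the attribute.
ids_internal <- xpathSApply(res, "//el", xmlGetAttr, "id")

# R tree: iterate over the children of the root node directly.
ids_tree <- sapply(xmlChildren(xmlRoot(res.tree)), xmlGetAttr, "id")
```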
    