How can I use PHP's various XML libraries to get DOM-like functionality and avoid DoS vulnerabilities, like Billion Laughs or Quadratic Blowup?

慢半拍i 2020-12-14 04:34

I'm writing a web application that has an XML API in PHP, and I'm worried about three specific vulnerabilities, all related to inline DOCTYPE definitions: local file inclu

2 Answers
  •  自闭症患者
    2020-12-14 05:01

    Note: If you create test cases from files that contain the XML chunks below, expect that editors may be prone to these attacks as well and might freeze or crash.

    Billion Laughs

    
    
      
      
      
      
      
      
      
      
      
    ]>
    &lol9;
    

    When loading:

    FATAL: #89: Detected an entity reference loop 1:7
    ... (plus six times the same = seven times total with above)
    FATAL: #89: Detected an entity reference loop 14:13

    Result:

    
    

    Memory usage is light; DOMDocument does not touch the peak. Since this example produces 7 fatal errors, one can conclude, and it is indeed the case, that a document with only two entity levels loads without errors:

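    <?xml version="1.0"?>
    <!DOCTYPE lolz [
      <!ENTITY lol "lol">
      <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
    ]>
    <lolz>&lol2;</lolz>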
    
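    Error lines like the ones listed earlier can be collected programmatically instead of being read from PHP warnings. The following is a minimal sketch, assuming libxml's internal error buffer is used; the helper name loadXmlCollectingErrors and the exact message format are illustrative and not taken from the original test script:

    ```php
    <?php
    // Hypothetical helper: parse a string and return libxml's diagnostics
    // instead of letting them surface as PHP warnings.
    function loadXmlCollectingErrors(string $xml, int $options = 0): array
    {
        // Buffer libxml errors and remember the previous setting.
        $previous = libxml_use_internal_errors(true);
        libxml_clear_errors();

        $doc = new DOMDocument();
        $loaded = $doc->loadXML($xml, $options);

        $messages = [];
        foreach (libxml_get_errors() as $error) {
            $level = $error->level === LIBXML_ERR_FATAL ? 'FATAL' : 'NON-FATAL';
            $messages[] = sprintf(
                '%s: #%d: %s %d:%d',
                $level,
                $error->code,
                trim($error->message),
                $error->line,
                $error->column
            );
        }

        // Restore the global libxml state for the rest of the application.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        return ['loaded' => $loaded, 'errors' => $messages, 'doc' => $doc];
    }

    // The small two-level document; harmless to parse.
    $small = '<?xml version="1.0"?>' . "\n"
           . '<!DOCTYPE lolz [' . "\n"
           . '  <!ENTITY lol "lol">' . "\n"
           . '  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">' . "\n"
           . ']>' . "\n"
           . '<lolz>&lol2;</lolz>';

    $result = loadXmlCollectingErrors($small);
    var_dump($result['loaded']);
    print_r($result['errors']);
    ```

    Restoring the previous libxml_use_internal_errors() value keeps the helper from changing error handling for unrelated code in the same request.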

    Since entity substitution is not in effect and this works, let's try the next attack:

    Quadratic Blowup

    This is the following one, shortened for your viewing pleasure (my variants are about 27 KB and 11 KB):

    <?xml version="1.0"?>
    <!DOCTYPE kaboom [
      <!ENTITY a "aaaaaaaaaaaaaaaaaa...">
    ]>
    <kaboom>&a;&a;&a;&a;&a;&a;&a;&a;&a;...</kaboom>
    

    If you use $doc->loadXML($src, LIBXML_NOENT); this does work as an attack: while I write this, the script is still loading. So it actually takes some time to load and consumes memory, which is something you can experiment with on your own. Without LIBXML_NOENT it loads flawlessly and fast.

    But there is a caveat: if you obtain the nodeValue of a tag, for example, you will get the entities expanded even if you don't use that loading flag.
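
    The difference between the two loading modes, and the nodeValue caveat, can be observed safely on a tiny document. This is a minimal sketch; the element name r and the entity name a are placeholders chosen for illustration, and the entity is only three bytes long so nothing can blow up:

    ```php
    <?php
    // Deliberately tiny payload: a 3-byte entity referenced 4 times.
    $xml = '<?xml version="1.0"?>' . "\n"
         . '<!DOCTYPE r [<!ENTITY a "abc">]>' . "\n"
         . '<r>&a;&a;&a;&a;</r>';

    // Default options: entity references are kept as reference nodes.
    $plain = new DOMDocument();
    $plain->loadXML($xml);

    // LIBXML_NOENT: references are substituted while parsing.
    $substituted = new DOMDocument();
    $substituted->loadXML($xml, LIBXML_NOENT);

    // Serialize only the root element so the two results are easy to compare.
    $plainXml = $plain->saveXML($plain->documentElement);
    $substitutedXml = $substituted->saveXML($substituted->documentElement);

    printf("saveXML without LIBXML_NOENT: %s\n", $plainXml);
    printf("saveXML with LIBXML_NOENT:    %s\n", $substitutedXml);

    // Reading nodeValue is where the references get resolved in the default mode,
    // so compare its length with the length of the serialized XML.
    printf(
        "nodeValue without LIBXML_NOENT: %d bytes\n",
        strlen($plain->documentElement->nodeValue)
    );
    ```

    With a real attack document the same three measurements are the ones to watch: the serialized length stays small without the flag, while the text length does not.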

    A workaround for this issue is to remove the DocumentType node from the document. Note the following code:

    $doc = new DOMDocument();
    $doc->loadXML($s); // where $s is a Quadratic attack xml string above.
    // now remove the doctype node
    foreach ($doc->childNodes as $child) {
        if ($child->nodeType===XML_DOCUMENT_TYPE_NODE) {
            $doc->removeChild($child);
            break;
        }
    }
    // Now the following is true:
    assert($doc->doctype === null);
    assert($doc->lastChild->nodeValue === '...');
    // Note that entities remain unexpanded in the output XML.
    // This is not so good, since it makes the XML invalid.
    // Better is a manual walk through all nodes looking for XML_ENTITY_REF_NODE
    assert($doc->saveXML() === "<?xml version=\"1.0\"?>\n<kaboom>&a;&a;&a;&a;&a;&a;&a;&a;&a;...</kaboom>\n");
    // However, canonicalization will produce warnings because it must resolve entities
    assert($doc->C14N() === false);
    // The warning will look like:
    //    PHP Warning:  DOMNode::C14N(): Node XML_ENTITY_REF_NODE is invalid here
    

    So while this workaround will prevent an XML document from consuming resources in a DoS, it makes it easy to generate invalid XML.
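
    The manual walk mentioned in the code comments could look roughly like the following. This is a sketch under assumptions: the helper name countEntityReferences is hypothetical, and refusing a document that contains any entity reference is just one possible policy, not a recommendation from the original answer:

    ```php
    <?php
    // Hypothetical helper: counts entity reference nodes at or below $node.
    function countEntityReferences(DOMNode $node): int
    {
        if ($node->nodeType === XML_ENTITY_REF_NODE) {
            // Count the reference itself, but do not descend into the entity.
            return 1;
        }

        $count = 0;
        if ($node->hasChildNodes()) {
            foreach ($node->childNodes as $child) {
                $count += countEntityReferences($child);
            }
        }

        return $count;
    }

    // Tiny placeholder document with three references to a 3-byte entity.
    $xml = '<?xml version="1.0"?>' . "\n"
         . '<!DOCTYPE r [<!ENTITY a "abc">]>' . "\n"
         . '<r>&a;&a;<b>&a;</b></r>';

    $doc = new DOMDocument();
    $doc->loadXML($xml); // default options, so references stay in the tree

    $references = countEntityReferences($doc->documentElement);
    if ($references > 0) {
        echo "Document contains {$references} entity reference(s); not reading nodeValue.\n";
    }
    ```

    Because the walk only inspects node types, it does not trigger the expansion that reading nodeValue would.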

    Some figures (I reduced the file size, otherwise it takes too long):

    LIBXML_NOENT disabled                                          LIBXML_NOENT enabled
    
    Mem: 356 184 (Peak: 435 464)                                   Mem: 356 280 (Peak: 435 464)                             
    Loaded file quadratic-blowup-2.xml into string.                Loaded file quadratic-blowup-2.xml into string.          
    Mem: 368 400 (Peak: 435 464)                                   Mem: 368 496 (Peak: 435 464)                             
    DOMDocument loaded XML 11 881 bytes in 0.001368 secs.          DOMDocument loaded XML 11 881 bytes in 15.993627 secs.   
    Mem: 369 088 (Peak: 435 464)                                   Mem: 369 184 (Peak: 435 464)                             
    Removed load string.                                           Removed load string.                                     
    Mem: 357 112 (Peak: 435 464)                                   Mem: 357 208 (Peak: 435 464)                             
    Got XML (saveXML()), length: 11 880                            Got XML (saveXML()), length: 11 165 132                  
    Got Text (nodeValue), length: 11 160 314; 11.060893 secs.      Got Text (nodeValue), length: 11 160 314; 0.025360 secs. 
    Mem: 11 517 776 (Peak: 11 532 016)                             Mem: 11 517 872 (Peak: 22 685 360)                       
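
    The measurement script itself is not reproduced here. Figures of this shape could be gathered with something like the sketch below, which assumes memory_get_usage(), memory_get_peak_usage() and microtime() as the measuring tools; the helper names and the small stand-in payload are hypothetical:

    ```php
    <?php
    // Hypothetical helper: format a number with spaces as thousands separators.
    function formatCount(int $n): string
    {
        return number_format($n, 0, '.', ' ');
    }

    // Hypothetical helper: one "Mem: ... (Peak: ...)" log line.
    function memoryLine(): string
    {
        return sprintf(
            'Mem: %s (Peak: %s)',
            formatCount(memory_get_usage()),
            formatCount(memory_get_peak_usage())
        );
    }

    // Small stand-in payload; use a real test file only in a process you can kill.
    $xml = '<?xml version="1.0"?>' . "\n"
         . '<!DOCTYPE r [<!ENTITY a "' . str_repeat('a', 100) . '">]>' . "\n"
         . '<r>' . str_repeat('&a;', 100) . '</r>';

    $options = 0; // set to LIBXML_NOENT to measure the other column

    echo memoryLine(), "\n";

    $doc = new DOMDocument();
    $start = microtime(true);
    $loaded = $doc->loadXML($xml, $options);
    printf(
        "DOMDocument loaded XML %s bytes in %f secs.\n",
        formatCount(strlen($xml)),
        microtime(true) - $start
    );
    echo memoryLine(), "\n";

    printf("Got XML (saveXML()), length: %s\n", formatCount(strlen($doc->saveXML())));

    $start = microtime(true);
    $text = $doc->documentElement->nodeValue;
    printf(
        "Got Text (nodeValue), length: %s; %f secs.\n",
        formatCount(strlen($text)),
        microtime(true) - $start
    );
    echo memoryLine(), "\n";
    ```

    Running it once with $options = 0 and once with LIBXML_NOENT gives the two columns to compare.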
    

    I have not yet made up my mind about protection strategies, but I now know that loading the Billion Laughs document into PHPStorm, for example, will freeze it. I stopped testing the latter, as I didn't want to freeze it while writing this.
