Is htmlentities() sufficient for creating xml-safe values?

礼貌的吻别 2020-11-29 03:40

I'm building an XML file from scratch. Does htmlentities() convert every character that could potentially break an XML file (and possibly UTF-8 data)?

5 Answers
  •  被撕碎了的回忆
    2020-11-29 04:42

    I thought I'd add this for those who need to sanitize element content without losing the XML attributes.

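    // Returns SimpleXML-safe XML, keeping each element's attributes as well
    function sanitizeXML($xml_content, $xml_followdepth = true) {

        // Matches <name attrs>content</name>; \2 is the element name captured in the opening tag
        $pattern = '%<((\w+)\s?.*?)>(.+?)</\2>%si';

        if (preg_match_all($pattern, $xml_content, $xmlElements, PREG_SET_ORDER)) {

            $xmlSafeContent = '';

            foreach ($xmlElements as $xmlElem) {
                // Re-emit the opening tag with its attributes untouched
                $xmlSafeContent .= '<' . $xmlElem[1] . '>';
                if (preg_match($pattern, $xmlElem[3])) {
                    // Nested elements: recurse without adding the XML declaration
                    $xmlSafeContent .= sanitizeXML($xmlElem[3], false);
                } else {
                    // Leaf content: escape &, < and >
                    $xmlSafeContent .= htmlspecialchars($xmlElem[3], ENT_NOQUOTES);
                }
                $xmlSafeContent .= '</' . $xmlElem[2] . '>';
            }

            if (!$xml_followdepth)
                return $xmlSafeContent;
            else
                // Top-level call: prepend the XML declaration
                return "<?xml version='1.0' encoding='UTF-8'?>" . $xmlSafeContent;

        } else {
            // No elements found: treat the whole input as text
            return htmlspecialchars($xml_content, ENT_NOQUOTES);
        }

    }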
    

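    Usage:

    // Element names below are placeholders; only the text content is from the original example
    $body = <<<EG
    <catalog>
       <item id="1">
          <title>2016 & Au Rendez-Vous Des Enfoir&</title>
       </item>
    </catalog>
    EG;
    $newXml = sanitizeXML($body);
    var_dump($newXml);


    Returns (indented here for readability):

    <?xml version='1.0' encoding='UTF-8'?>
    <catalog>
        <item id="1">
            <title>2016 &amp; Au Rendez-Vous Des Enfoir&amp;</title>
        </item>
    </catalog>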
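
On the original question, when the XML is built from scratch it is usually enough to escape each value as it is written, rather than sanitizing a finished document. Below is a minimal sketch, assuming the source file is saved as UTF-8 and the SimpleXML extension is available. The helper name `xmlEscapeValue` and the sample `note` attribute are hypothetical and not part of the answer above. It uses htmlspecialchars() with ENT_XML1 so that only the XML predefined entities are produced, and it prints what htmlentities() does in its HTML mode for comparison.

```php
<?php
// Hypothetical helper (not from the answer above): escape a single value
// for use as element text or inside a quoted attribute.
function xmlEscapeValue(string $value): string
{
    // ENT_XML1       : use XML 1.0 entity names only
    // ENT_QUOTES     : escape both " and ' so the value is safe in attributes
    // ENT_SUBSTITUTE : replace invalid UTF-8 sequences instead of returning ''
    // ENT_DISALLOWED : replace code points that are not allowed in the document type
    $flags = ENT_XML1 | ENT_QUOTES | ENT_SUBSTITUTE | ENT_DISALLOWED;
    return htmlspecialchars($value, $flags, 'UTF-8');
}

$title = '2016 & Au Rendez-Vous Des Enfoirés <live>';
$note  = 'a "quoted" word';

$xml = '<title note="' . xmlEscapeValue($note) . '">'
     . xmlEscapeValue($title)
     . '</title>';

// simplexml_load_string() returns false if the string is not well-formed XML.
$parsed = simplexml_load_string($xml);
var_dump($parsed !== false);            // did the document parse?
var_dump((string) $parsed === $title);  // did the text round-trip unchanged?

// For comparison: in HTML mode, htmlentities() emits named HTML entities,
// which an XML parser does not know unless a DTD declares them.
var_dump(htmlentities('é', ENT_QUOTES | ENT_HTML401, 'UTF-8'));
```

The point of the comparison is that the accented character can stay as plain UTF-8 in the XML output. Only the markup characters need escaping, so htmlspecialchars() is the closer fit for this job.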
        
    
    
