PHP SimpleXML get innerXML

痞子三分冷 提交于 2019-11-26 15:31:25

To the best of my knowledge, there is not built-in way to get that. I'd recommend trying SimpleDOM, which is a PHP class extending SimpleXMLElement that offers convenience methods for most of the common problems.

include 'SimpleDOM.php';

$qa = simpledom_load_string(
    '<qa>
       <question>Who are you?</question>
       <answer>Who who, <strong>who who</strong>, <em>me</em></answer>
    </qa>'
);
echo $qa->answer->innerXML();

Otherwise, I see two ways of doing that. The first would be to convert your SimpleXMLElement to a DOMNode then loop over its childNodes to build the XML. The other would be to call asXML() then use string functions to remove the root node. Attention though, asXML() may sometimes return markup that is actually outside of the node it was called from, such as XML prolog or Processing Instructions.

function SimpleXMLElement_innerXML($xml)
  {
    $innerXML= '';
    foreach (dom_import_simplexml($xml)->childNodes as $child)
    {
        $innerXML .= $child->ownerDocument->saveXML( $child );
    }
    return $innerXML;
  };

This works (although it seems really lame):

echo (string)$qa->answer;
Frederic Bazin

most straightforward solution is to implement custom get innerXML with simple XML:

function simplexml_innerXML($node)
{
    $content="";
    foreach($node->children() as $child)
        $content .= $child->asXml();
    return $content;
}

In your code, replace $body_content = $el->asXml(); with $body_content = simplexml_innerXML($el);

However, you could also switch to another API that offers distinction between innerXML (what you are looking for) and outerXML (what you get for now). Microsoft Dom libary offers this distinction but unfortunately PHP DOM doesn't.

I found that PHP XMLReader API offers this distintion. See readInnerXML(). Though this API has quite a different approach to processing XML. Try it.

Finally, I would stress that XML is not meant to extract data as subtrees but rather as value. That's why you running into trouble finding the right API. It would be more 'standard' to store HTML subtree as a value (and escape all tags) rather than XML subtree. Also beware that some HTML synthax are not always XML compatible ( i.e.
vs ,
). Anyway in practice, you approach is definitely more convenient for editing the xml file.

I would have extend the SimpleXmlElement class:

class MyXmlElement extends SimpleXMLElement{

    final public function innerXML(){
        $tag = $this->getName();
        $value = $this->__toString();
        if('' === $value){
            return null;
        }
        return preg_replace('!<'. $tag .'(?:[^>]*)>(.*)</'. $tag .'>!Ums', '$1', $this->asXml());
    }
}

and then use it like this:

echo $qa->answer->innerXML();
user1240602
<?php
    function getInnerXml($xml_text) {           
        //strip the first element
        //check if the strip tag is empty also
        $xml_text = trim($xml_text);
        $s1 = strpos($xml_text,">");        
        $s2 = trim(substr($xml_text,0,$s1)); //get the head with ">" and trim (note that string is indexed from 0)

        if ($s2[strlen($s2)-1]=="/") //tag is empty
            return "";

        $s3 = strrpos($xml_text,"<"); //get last closing "<"        
        return substr($xml_text,$s1+1,$s3-$s1-1);
    }

    var_dump(getInnerXml("<xml />"));
    var_dump(getInnerXml("<xml  /  >faf <  / xml>"));
    var_dump(getInnerXml("<xml      ><  / xml>"));    
    var_dump(getInnerXml("<xml>faf <  / xml>"));
    var_dump(getInnerXml("<xml  >  faf <  / xml>"));      
?>

After I search for a while, I got no satisfy solution. So I wrote my own function. This function will get exact the innerXml content (including white-space, of course). To use it, pass the result of the function asXML(), like this getInnerXml($e->asXML()). This function work for elements with many prefixes as well (as my case, as I could not find any current methods that do conversion on all child node of different prefixes).

Output:

string '' (length=0)    
string '' (length=0)    
string '' (length=0)    
string 'faf ' (length=4)    
string '  faf ' (length=6)
    function get_inner_xml(SimpleXMLElement $SimpleXMLElement)
    {
        $element_name = $SimpleXMLElement->getName();
        $inner_xml = $SimpleXMLElement->asXML();
        $inner_xml = str_replace('<'.$element_name.'>', '', $inner_xml);
        $inner_xml = str_replace('</'.$element_name.'>', '', $inner_xml);
        $inner_xml = trim($inner_xml);
        return $inner_xml;
    }

If you don't want to strip CDATA section, comment out lines 6-8.

function innerXML($i){
    $text=$i->asXML();
    $sp=strpos($text,">");
    $ep=strrpos($text,"<");
    $text=trim(($sp!==false && $sp<=$ep)?substr($text,$sp+1,$ep-$sp-1):'');
    $sp=strpos($text,'<![CDATA[');
    $ep=strrpos($text,"]]>");
    $text=trim(($sp==0 && $ep==strlen($text)-3)?substr($text,$sp+9,-3):$text);
    return($text);
}

You can just use this function :)

function innerXML( $node )
{
    $name = $node->getName();
    return preg_replace( '/((<'.$name.'[^>]*>)|(<\/'.$name.'>))/UD', "", $node->asXML() );
}

using regex you could do this

preg_match(’/<answer(.*)?>(.*)?<\/answer>/’, $xml, $match);
$result=$match[0];
print_r($result);
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!