Unable to parse xml data with colon (:) from response using getNamespaces()

强颜欢笑 submitted on 2019-11-29 15:09:11
hakre

I had no trouble getting it to work. The only error I could find is that the XML you are loading contains a chunk of HTML that is not well-formed XML, and this breaks the document: the meta elements in the head section are not closed.

See Demo.
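For context, here is a minimal sketch of reading colon-prefixed (namespaced) elements with SimpleXML once the document loads cleanly. The XML, the ns prefix and the namespace URI are made up for illustration and are not taken from the original response.

```php
<?php
// Made-up sample document with one namespaced child element.
$xml = '<root xmlns:ns="http://example.com/ns"><ns:item>value</ns:item></root>';

$sx = simplexml_load_string($xml);

// getNamespaces(true) returns prefix => URI for namespaces used in the whole tree.
$namespaces = $sx->getNamespaces(true);

// children() with a namespace URI exposes the elements that carry that prefix.
$items = $sx->children($namespaces['ns']);
$value = (string) $items->item;

echo $value, "\n";
```

Elements written as ns:item are not reachable as plain properties of $sx, which is why the lookup goes through children() with the namespace URI.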

Tip: Always enable error logging and reporting, and check for warnings and notices while you develop and debug code. A short one-liner that displays all kinds of PHP error messages, including warnings, notices and strict standards:

error_reporting(-1); ini_set('display_errors', 1);

With that in place, DOMDocument is quite talkative about malformed elements when loading XML.
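If you would rather inspect the parser messages in code than read them as warnings, libxml's internal error handling can collect them. This is a minimal sketch; the sample XML is made up to mimic an unclosed meta element.

```php
<?php
// Made-up sample: the <meta> element is never closed, as is common in HTML.
$xml = '<root><head><meta name="a"></head></root>';

// Collect libxml errors instead of emitting them as PHP warnings.
libxml_use_internal_errors(true);

$doc = new DOMDocument();
$ok = $doc->loadXML($xml);      // false when the document is not well-formed

$errors = libxml_get_errors();  // array of LibXMLError objects
libxml_clear_errors();

foreach ($errors as $error) {
    printf("line %d, column %d: %s\n", $error->line, $error->column, trim($error->message));
}
```

Each LibXMLError carries the line and column of the problem, which helps locate the offending chunk in a long response.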

Fixing the XML "on the fly"

DOMDocument only accepts well-formed XML. If you have HTML, you can try whether DOMDocument::loadHTML() does the job instead; however, it converts the loaded string into an X(HT)ML document. That is probably not what you are looking for.
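To see what that conversion means in practice, here is a small sketch that loads a made-up HTML string with an unclosed meta element and serializes the result as XML.

```php
<?php
// Made-up HTML sample with an unclosed <meta> element.
$html = '<html><head><meta name="a"><title>t</title></head><body><p>Hi</p></body></html>';

libxml_use_internal_errors(true);   // keep HTML parser notices out of the output

$doc = new DOMDocument();
$doc->loadHTML($html);

// The HTML parser repairs the markup; the tree can then be serialized as XML.
$metaCount = $doc->getElementsByTagName('meta')->length;
$rootName  = $doc->documentElement->nodeName;

echo $doc->saveXML($doc->documentElement), "\n";
```

The whole input is treated as an HTML document, so this only helps when the string is HTML throughout, not when HTML is embedded in a larger XML response.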

To make a specific part of the string XML-compatible before loading it, you can search for string patterns to locate the substring that holds the HTML inside the XML and then XML-encode it properly.

For example, you can look for <html> and </html> as the surrounding tags, extract that substring, and put the encoded version back in its place with substr_replace(). To encode the HTML so that it can be used as data inside the XML, use the htmlspecialchars() function; with ENT_QUOTES it replaces the five special characters (&, <, >, " and ') with entities or character references, as described in the other SO answer.

Some mock-up code:

// Locate the embedded HTML chunk inside the XML string.
$htmlStart = strpos($xml, '<html>');
if (false === $htmlStart) throw new Exception('<html> not found.');
$htmlEnd = strpos($xml, '</html>', $htmlStart);
if (false === $htmlEnd) throw new Exception('</html> not found.');
// Include the closing tag itself in the extracted length.
$htmlLen = $htmlEnd - $htmlStart + strlen('</html>');
$htmlString = substr($xml, $htmlStart, $htmlLen);
// Encode the chunk so it becomes plain character data, then put it back.
$htmlEscaped = htmlspecialchars($htmlString, ENT_QUOTES);
$xml = substr_replace($xml, $htmlEscaped, $htmlStart, $htmlLen);
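The mock-up above can be wrapped in a function and followed by the actual load, roughly as sketched here. The function name escapeEmbeddedHtml(), the response and payload element names and the sample string are all hypothetical; the sketch also assumes the response contains a single <html>...</html> chunk.

```php
<?php
/**
 * Hypothetical helper: XML-escape the first <html>...</html> chunk in $xml.
 */
function escapeEmbeddedHtml(string $xml): string
{
    $start = strpos($xml, '<html>');
    if ($start === false) {
        throw new RuntimeException('<html> not found.');
    }
    $end = strpos($xml, '</html>', $start);
    if ($end === false) {
        throw new RuntimeException('</html> not found.');
    }
    $length  = $end - $start + strlen('</html>');
    $escaped = htmlspecialchars(substr($xml, $start, $length), ENT_QUOTES);

    return substr_replace($xml, $escaped, $start, $length);
}

// Made-up response: XML wrapping an HTML chunk with an unclosed <meta>.
$xml = '<response><payload><html><head><meta name="a"></head>'
     . '<body>Hi</body></html></payload></response>';

$doc = new DOMDocument();
$loaded = $doc->loadXML(escapeEmbeddedHtml($xml));

// The HTML is now character data; textContent gives back the original markup.
$html = $doc->getElementsByTagName('payload')->item(0)->textContent;
echo $html, "\n";
```

After escaping, the HTML no longer contributes elements to the XML tree, so it has to be parsed separately (for instance with loadHTML()) if its structure is needed.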