PHP DOMDocument loadHTML not encoding UTF-8 correctly

梦如初夏 2020-11-22 15:11

I'm trying to parse some HTML using DOMDocument, but when I do, I suddenly lose my encoding (at least that is how it appears to me).

$profile = "

        
13 Answers
  •  野趣味 (OP)
    2020-11-22 15:46

    Use this to get the correct result:

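    $dom = new DOMDocument();
    // Prepend an XML encoding declaration so libxml reads the markup as UTF-8
    $dom->loadHTML('<?xml encoding="utf-8" ?>' . $profile);
    echo $dom->saveHTML();
    echo $profile;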
    

    Avoid this operation:

    mb_convert_encoding($profile, 'HTML-ENTITIES', 'UTF-8');
    

    It is a bad approach, because $profile may already contain entities such as &lt; and &gt;, and they will not be converted a second time by mb_convert_encoding. That opens a hole for XSS and produces incorrect HTML.
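
For reference, here is a minimal, self-contained sketch of the DOMDocument approach. It assumes the string prepended to $profile is an XML encoding declaration, which is a common workaround rather than a documented API, so behaviour may vary between libxml versions. The sample markup in $profile and the cleanup loop are illustrative additions, not part of the original answer.

```php
<?php
// Hypothetical sample input; any UTF-8 HTML fragment would do.
$profile = '<p>イリノイ州シカゴにて</p>';

// Collect parser warnings internally instead of printing them.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
// Assumption: the prepended declaration hints to libxml that the bytes are UTF-8.
$dom->loadHTML('<?xml encoding="utf-8" ?>' . $profile);
libxml_clear_errors();

// The hint may be kept in the tree as a processing-instruction node;
// remove it if present so it does not leak into the output.
foreach ($dom->childNodes as $node) {
    if ($node->nodeType === XML_PI_NODE) {
        $dom->removeChild($node);
        break; // stop iterating once the list has been modified
    }
}
// Declare the document encoding used when serializing.
$dom->encoding = 'UTF-8';

// Read the paragraph text back out of the parsed tree.
$text = $dom->getElementsByTagName('p')->item(0)->textContent;
echo $text, PHP_EOL;

$html = $dom->saveHTML();
echo $html;
```

If the text printed from the tree matches the input, the encoding survived parsing. Removing the processing-instruction node is optional and only matters when the serialized document is reused elsewhere.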
