Unescape HTML entities containing newline in JavaScript?

Posted by ≯℡__Kan透↙ on 2019-12-05 14:52:36

The simplest, though probably not the most efficient, solution is to have htmlDecode() act only on character and entity references:

var s = "foo\n&amp;\nbar";
s = s.replace(/(&[^;]+;)+/g, htmlDecode);
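
To see what the regular expression actually hands to the callback, here is a minimal sketch that swaps htmlDecode() for a stand-in. The names stubDecode, KNOWN and seen are hypothetical and not part of the original answer; the stand-in knows only four references and exists so that the snippet runs without a DOM.

```javascript
// Hypothetical stand-in for htmlDecode(): a tiny lookup table, NOT a full decoder.
var KNOWN = { "&amp;": "&", "&lt;": "<", "&gt;": ">", "&quot;": "\"" };

var seen = []; // records every string the callback receives

function stubDecode(match) {
  seen.push(match);
  // A run such as "&lt;&gt;" arrives as one match, so split it into single references.
  return match.replace(/&[^;]+;/g, function (ref) {
    return Object.prototype.hasOwnProperty.call(KNOWN, ref) ? KNOWN[ref] : ref;
  });
}

var s = "foo\n&amp;\nbar &lt;&gt;";
var decoded = s.replace(/(&[^;]+;)+/g, stubDecode);

// seen    is ["&amp;", "&lt;&gt;"]
// decoded is "foo\n&\nbar <>"
```

Only the references reach the callback, so the newlines and the rest of the text are never touched by the decoder.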

A more efficient approach is an optimized rewrite of htmlDecode() that is called only once per input, acts only on character and entity references, and reuses the DOM element object:

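function htmlDecode (input)
{
  /* One element is created per call and reused for every match */
  var e = document.createElement("span");

  /* Only runs of character/entity references are passed to the DOM */
  var result = input.replace(/(&[^;]+;)+/g, function (match) {
    e.innerHTML = match;

    /* The parsed references end up as a single text node */
    return e.firstChild.nodeValue;
  });

  return result;
}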

/* returns "foo\n&\nbar" */
htmlDecode("foo\n&amp;\nbar");

Wladimir Palant has pointed out an XSS issue with this function: if you assign HTML to the innerHTML property and that HTML contains elements with certain (HTML5) event listener attributes, such as onerror, the values of those attributes are executed. You should therefore use this function only on HTML that is already escaped, not on arbitrary input containing actual HTML. Otherwise, adapt the regular expression accordingly; for example, use /(&[^;<>]+;)+/ instead, which prevents &…; from being matched when the … contains tags.
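
The difference between the two patterns can be checked without touching the DOM. The following is a minimal sketch; the variable names and the sample input are made up for illustration and are not part of the original answer.

```javascript
var loose  = /(&[^;]+;)+/g;   // pattern used in htmlDecode() above
var strict = /(&[^;<>]+;)+/g; // variant that refuses to span "<" or ">"

// Illustrative input: a stray "&", then a tag, then a ";" further on.
var hostile = "a & b <img src=x onerror=alert(1)> c;";
var escaped = "foo\n&amp;\nbar";

// With the g flag, match() returns an array of all matches, or null if there are none.
var looseHit   = hostile.match(loose);  // ["& b <img src=x onerror=alert(1)> c;"]
var strictHit  = hostile.match(strict); // null
var escapedHit = escaped.match(strict); // ["&amp;"]
```

With the loose pattern the whole tag would be assigned to innerHTML; with the strict pattern nothing in the hostile input matches, while properly escaped references are still found.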

For arbitrary HTML, please see Wladimir Palant's alternative approach, but note that it is not as compatible as this one.
