How do I escape closing '/' in HTML tags in JSON with Python?

Submitted by 此生再无相见时 on 2021-01-22 08:02:13

Question


Note: This question is very close to Embedding JSON objects in script tags, but the responses to that question only tell me what I already know (that in JSON, / and \/ are equivalent). I want to know how to do that escaping.

The HTML spec prohibits closing HTML tags anywhere within a <script> element. So, this causes parse errors:

<script>
var assets = [{
  "asset_created": null, 
  "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", 
  "body": "<script></script>"
}];
</script>

In my case, the invalid markup comes from rendering a JSON string inside a Django template, i.e.:

<script>
var assets = {{ json_string }};
</script>

I know that JSON parses \/ the same as /, so if I can just escape my closing HTML tags in the JSON string, I'll be good. But, I'm not sure of the best way to do this.

My naive approach would just be this:

json_string = '[{"asset_created": null, "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", "body": "<script></script>"}]'
escaped_json_string = json_string.replace('</', r'<\/')
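A quick way to check that this replacement is safe is to round-trip the result through a JSON parser. The sketch below assumes the string was produced by a JSON serializer (here the standard library's json module), so that "</" can only occur inside string values:

```python
import json

# Same data as in the question, serialized with the standard library.
assets = [{
    "asset_created": None,
    "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f",
    "body": "<script></script>",
}]
json_string = json.dumps(assets)

# Escape the slash of every closing tag: "</" becomes "<\/".
escaped_json_string = json_string.replace("</", r"<\/")

# "\/" is a legal JSON escape for "/", so parsing restores the original data.
round_tripped = json.loads(escaped_json_string)
print(escaped_json_string)
print(round_tripped == assets)
```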

Is there a better way? Or any gotchas that I'm overlooking?


Answer 1:


Updated Answer

Okay, I assumed a few things incorrectly. For escaping the JSON, the simplejson library has an encoder class, JSONEncoderForHTML, that can be used. You may need to install simplejson via pip or easy_install if the code doesn't work. Then you can do something like this:

import simplejson

# Parse the existing JSON string, then re-encode it with HTML-unsafe characters escaped.
assets_json = simplejson.loads(json_string)
encoded = simplejson.encoder.JSONEncoderForHTML().encode(assets_json)

after which encoded will hold something like this (shown as a Python repr, so the backslashes appear doubled):

'{"asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", "body": "\\u003cscript\\u003e\\u003c/script\\u003e", "asset_created": null}'

This is a more general solution than replacing the slash, as it also escapes the other characters that cause trouble inside HTML (<, > and &).

The loads call is only needed because the JSON is already encoded. It can be avoided by not using Django to generate the JSON, if possible, and using simplejson instead:

simplejson.dumps(your_object_to_encode, cls=simplejson.encoder.JSONEncoderForHTML)
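If installing simplejson is not an option, a similar effect can be approximated with the standard library alone. The following is a minimal sketch, not simplejson's implementation: dumps_for_html is a hypothetical helper name, and it relies on the fact that <, > and & can only appear inside string values in JSON output, where a \uXXXX escape is equivalent to the literal character.

```python
import json

# Hypothetical helper (not part of json or simplejson): serialize to JSON,
# then replace the characters that matter to an HTML parser with \uXXXX escapes.
_HTML_UNSAFE = {
    "<": "\\u003c",
    ">": "\\u003e",
    "&": "\\u0026",
}


def dumps_for_html(obj):
    text = json.dumps(obj)
    for char, escape in _HTML_UNSAFE.items():
        text = text.replace(char, escape)
    return text


assets = [{
    "asset_created": None,
    "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f",
    "body": "<script></script>",
}]
encoded = dumps_for_html(assets)
print(encoded)
```

Because the escapes are plain JSON, json.loads(encoded) (or the browser's JavaScript parser) recovers the original strings unchanged.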

Old Answer

Try wrapping your script in CDATA:

<script>
//<![CDATA[
var assets = [{
  "asset_created": null, 
  "asset_id": "575155948f7d4c4ebccb02d4e8f84d2f", 
  "body": "<script></script>"
}];
//]]>
</script>

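CDATA is meant to tell the parser to treat the contents as literal text. Note that this only takes effect when the page is parsed as XHTML/XML; an HTML parser ignores the CDATA markers inside a script and still ends the element at the first </script>. Otherwise you'll need to use the character escapes that have been mentioned.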



Source: https://stackoverflow.com/questions/15297028/how-do-i-escape-closing-in-html-tags-in-json-with-python
