How can I get unicode characters from a URL parameter?

Submitted by 时光毁灭记忆、已成空白 on 2019-12-05 03:28:45

Question


I need to use a GET request to send JSON to my server from a JavaScript client, so I started echoing responses back to make sure nothing is lost in translation. There doesn't seem to be a problem with plain ASCII text, but as soon as I include a non-ASCII character (e.g. "ç"), it comes back escaped (e.g. "\u00e7"), so the returned value differs from the request value. My primary concerns are that A) my Python code saves what the client intended to send to the database correctly, and B) I echo back to the client the same values that were sent (when testing).

Perhaps this means I can't use base64, or have to do something different along the way. I'm ok with that. My implementation is just an attempt at a means to an end.

Current steps (any step can be changed, if needed):

Raw JSON string which I want to send to the server:

'{"weird-chars": "°ç"}'

JavaScript Base64 encoded version of the string passed to server via GET param (on a side note, will the equals sign at the end of the encoded string cause any issues?):

http://www.myserver.com/?json=eyJ3ZWlyZC1jaGFycyI6ICLCsMOnIn0=

Python str result from b64decode of param:

'{"weird-chars": "\xc2\xb0\xc3\xa7"}'

Python dict from json.loads of decoded param:

{'weird-chars': u'\xb0\xe7'}

Python str from json.dumps of that dict (and subsequent output to the browser):

'{"weird-chars": "\u00b0\u00e7"}'

Answer 1:


Everything looks fine to me.

>>> hex(ord(u'°'))
'0xb0'
>>> hex(ord(u'ç'))
'0xe7'

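The "\u00b0" and "\u00e7" in your output are JSON escape sequences for those same two characters, so nothing has been lost. Perhaps you should decode the JSON on the client before attempting to use it.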
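To make that concrete, here is a minimal sketch written for Python 3 (the question uses Python 2, where the same json calls exist). The variable names are only illustrative. It parses the echoed string back and checks that the escapes stand for the original characters. It also shows ensure_ascii=False, a standard json.dumps option, as one possible way to emit the characters unescaped if the echoed text itself has to match.

```python
import json

# The text the server echoed back: pure-ASCII JSON containing \uXXXX escapes.
echoed = '{"weird-chars": "\\u00b0\\u00e7"}'

# Parsing the JSON turns the escapes back into the original characters.
decoded = json.loads(echoed)
print(decoded["weird-chars"] == "\u00b0\u00e7")

# Optional: ask json.dumps not to escape non-ASCII characters, so the
# serialized text contains the characters themselves.
unescaped = json.dumps(decoded, ensure_ascii=False)
print(unescaped)
```

With ensure_ascii=False the result is a Unicode string, so it still has to be encoded (for example as UTF-8) before it is written to the response.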




Answer 2:


Your procedure is fine; you just need one more step: encoding from unicode to UTF-8 (or any other encoding that supports the "weird" characters).

Think of decoding as what you do to go from a regular (byte) string to unicode, and encoding as what you do to get back from unicode. In other words:

You decode a str to produce a unicode string

and encode a unicode string to produce a str.

So:

params = {'weird-chars': u'\xb0\xe7'}

encodedchars = params['weird-chars'].encode('utf-8')

encodedchars will then hold your characters as a byte string (a str) in the selected encoding; for UTF-8 that is '\xc2\xb0\xc3\xa7'.
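The whole round trip from the question can be sketched as follows. This is a rough illustration written for Python 3, where str is the Unicode type and bytes is the byte string, so the names differ from the Python 2 terms used above. The helpers build_url, read_payload and echo_body are hypothetical names, and build_url is only a Python stand-in for the JavaScript client. Percent-encoding the Base64 text is a cautious choice rather than a requirement: a trailing "=" is usually tolerated in a query value, but "+" can appear in Base64 output and form-style parsers read it as a space.

```python
import base64
import json
from urllib.parse import parse_qs, quote, urlsplit


def build_url(base, payload):
    """Stand-in for the client: JSON -> UTF-8 bytes -> Base64 -> query value."""
    raw = json.dumps(payload, ensure_ascii=False).encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    # safe="" percent-encodes "=", "+" and "/" so the token survives
    # query-string parsing unchanged.
    return base + "?json=" + quote(token, safe="")


def read_payload(url):
    """Server side: query value -> Base64 decode -> UTF-8 decode -> dict."""
    query = parse_qs(urlsplit(url).query)
    token = query["json"][0]       # parse_qs has already undone %XX escapes
    raw = base64.b64decode(token)  # bytes, e.g. b'...\xc2\xb0\xc3\xa7...'
    text = raw.decode("utf-8")     # the decode step: bytes -> text
    return json.loads(text)


def echo_body(payload):
    """Serialize without \\uXXXX escapes, then encode to UTF-8 for the response."""
    return json.dumps(payload, ensure_ascii=False).encode("utf-8")


url = build_url("http://www.myserver.com/", {"weird-chars": "\u00b0\u00e7"})
payload = read_payload(url)
body = echo_body(payload)
print(url)
print(payload)
print(body)
```

If the response is sent this way, it should also declare its encoding (for example a Content-Type header with charset=utf-8) so the browser decodes the bytes as UTF-8.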



Source: https://stackoverflow.com/questions/4474430/how-can-i-get-unicode-characters-from-a-url-parameter
