Question
I have a hex string and I want to convert it to UTF-8 so I can insert it into MySQL (my database is UTF-8).
hex_string = 'kitap ara\xfet\xfdrmas\xfd'
...
result = 'kitap araştırması'
How can I do that?
Answer 1:
Assuming Python 2.6,
>>> print('kitap ara\xfet\xfdrmas\xfd'.decode('iso-8859-9'))
kitap araştırması
>>> 'kitap ara\xfet\xfdrmas\xfd'.decode('iso-8859-9').encode('utf-8')
'kitap ara\xc5\x9ft\xc4\xb1rmas\xc4\xb1'
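For readers on Python 3, the same two steps can be sketched as follows. This is only an illustration and assumes the raw data arrives as a bytes object (in Python 3 a str has no decode method); the variable names are made up for the example.

```python
# Python 3 sketch: the raw input must be bytes, not str.
raw = b'kitap ara\xfet\xfdrmas\xfd'

# Step 1: decode the ISO-8859-9 bytes into a Unicode string.
text = raw.decode('iso-8859-9')

# Step 2: encode that string as UTF-8 for storage or transmission.
utf8_bytes = text.encode('utf-8')

print(text)
print(utf8_bytes)
```

The intermediate Unicode string is the important part: the bytes are never converted from one encoding to another directly.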
Answer 2:
Try (Python 3.x):
import codecs
codecs.decode("707974686f6e2d666f72756d2e696f", "hex").decode('utf-8')
From here.
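Note that this approach applies to a string of hex digits, which is a different input from the escaped bytes in the question. As a rough Python 3 sketch, bytes.fromhex does the same job without the codecs module; the variable names here are only illustrative.

```python
# A string of hex digits (two digits per byte), as in the answer above.
hex_digits = "707974686f6e2d666f72756d2e696f"

# Turn the digits into raw bytes, then decode those bytes as UTF-8.
raw = bytes.fromhex(hex_digits)
decoded = raw.decode('utf-8')

print(decoded)
```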
Answer 3:
Try
hex_string.decode("cp1254").encode("utf-8")
(cp1254 and iso-8859-9 are the Turkish code pages, the former being the usual name on Windows platforms. The two differ only in the 0x80-0x9F range, so for this string both work equally well in Python.)
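A small Python 3 sketch of how the two codecs compare, assuming the input is a bytes object. Both give the same result for the string in the question, but they disagree on bytes in the 0x80-0x9F range, which matters if the data really came from Windows.

```python
raw = b'kitap ara\xfet\xfdrmas\xfd'

# Both codecs agree on every byte that appears in this string.
same_result = raw.decode('cp1254') == raw.decode('iso-8859-9')

# They disagree on 0x80-0x9F: cp1254 maps 0x80 to the euro sign,
# while iso-8859-9 maps it to a C1 control character.
from_cp1254 = b'\x80'.decode('cp1254')
from_iso = b'\x80'.decode('iso-8859-9')

print(same_result, repr(from_cp1254), repr(from_iso))
```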
Answer 4:
First you need to decode it from the encoded bytes you have. That appears to be ISO-8859-9 (latin-5), or, if you are using Windows, probably code page 1254, which is based on latin-5.
>>> 'kitap ara\xfet\xfdrmas\xfd'.decode('cp1254')
u'kitap ara\u015ft\u0131rmas\u0131' # u'kitap araştırması'
If you are using Windows, then depending on where you are getting those bytes, it might be more appropriate to decode them as mbcs, which translates to 'whichever code page the local system is using'. If the string is just sitting in a .py file, you would be better off just writing u'kitap araştırması' in the source and setting a -*- coding declaration to direct Python to decode it. See PEP 263.
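To see what a coding declaration does, here is a hedged Python 3 sketch that builds a tiny cp1254-encoded source file in memory and compiles it. The file contents and names are invented for the demonstration; in practice the declaration simply sits on the first or second line of the real .py file.

```python
# Build the bytes of a small source file saved in cp1254.
# The first line is the PEP 263 coding declaration.
source = (
    "# -*- coding: cp1254 -*-\n"
    "title = 'kitap araştırması'\n"
).encode('cp1254')

# compile() reads the declaration and decodes the bytes accordingly.
namespace = {}
exec(compile(source, '<cp1254-demo>', 'exec'), namespace)
title = namespace['title']

print(title)
```

Without the declaration, Python 3 would try to read the file as UTF-8 and reject the 0xFE and 0xFD bytes.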
As to how to encode unicode strings to UTF-8 for the database, well, if you want to you can do it manually:
>>> u'kitap ara\u015ft\u0131rmas\u0131'.encode('utf-8')
'kitap ara\xc5\x9ft\xc4\xb1rmas\xc4\xb1'
but a good data access layer is likely to do that automatically for you, provided the character set and collation of the connection and of the tables the data is going into are set correctly.
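The idea of letting the data access layer handle the encoding can be sketched as below. This is an assumption-heavy illustration: it uses Python 3 and the standard library's sqlite3 module as a stand-in, because a MySQL server and driver are not available here, and the table and column names are hypothetical. The point is only that the application passes a Unicode string as a query parameter and never encodes it by hand.

```python
import sqlite3

# Decode the raw bytes once, at the boundary of the program.
text = b'kitap ara\xfet\xfdrmas\xfd'.decode('cp1254')

# sqlite3 is only a stand-in for a MySQL driver in this sketch.
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE searches (query TEXT)')  # hypothetical table

# Pass the Unicode string as a parameter; the driver encodes it.
conn.execute('INSERT INTO searches (query) VALUES (?)', (text,))

# Read it back both as text and as the raw stored bytes.
stored_text, stored_bytes = conn.execute(
    'SELECT query, CAST(query AS BLOB) FROM searches'
).fetchone()
conn.close()

print(stored_text, stored_bytes)
```

With a MySQL driver the equivalent setup would also involve choosing the connection character set, which this sketch does not cover.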
Answer 5:
String literals explains how to use UTF8 strings in Python source.
Source: https://stackoverflow.com/questions/3045876/decode-string-with-hex-characters-in-python-2