Question
So far I am doing something like this:
def is_utf8(s):
    try:
        x = bytes(s, 'utf-8').decode('utf-8', 'strict')
        print(x)
        return 1
    except:
        return 0
The only problem is that I don't want it to print anything. I want to delete the print(x),
but when I do that, the function stops working correctly.
For example, if I call print(is_utf8("H�tst")),
it prints 0 while print(x) is in the function and 1 once it is removed. Am I approaching the problem the wrong way?
Answer 1:
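You could use the third-party chardet module to detect an unknown encoding. The result is a guess
that comes with a confidence score, not a guarantee. For example, if a
is a byte array, then you could determine the most likely encoding like this: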
import chardet
b = chardet.detect(a)
print(b["encoding"])
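A likely explanation for the behaviour in the question, offered as a sketch and not a definitive diagnosis: in Python 3 a str can almost always be encoded to UTF-8 (lone surrogates are the exception), so the encode/decode round trip succeeds for nearly any input and the function returns 1 once print(x) is removed. With print(x) in place, the failure probably comes from print itself when the console encoding cannot represent a character, and the bare except turns that into 0. A strict validity check is more meaningful on raw bytes. The sample inputs below are hypothetical and not taken from the question.

```python
def is_utf8(data: bytes) -> bool:
    """Return True if data is a valid UTF-8 byte sequence."""
    try:
        # Strict decoding raises UnicodeDecodeError on any invalid sequence.
        data.decode("utf-8", "strict")
        return True
    except UnicodeDecodeError:
        # Catch only the decoding error so unrelated failures are not hidden.
        return False


# Hypothetical inputs: the same text encoded two different ways.
print(is_utf8("Hätst".encode("utf-8")))    # bytes produced by a UTF-8 encoder
print(is_utf8("Hätst".encode("latin-1")))  # b'H\xe4tst', not valid UTF-8
```

Catching UnicodeDecodeError specifically, instead of using a bare except, keeps an unrelated error such as a failing print from being mistaken for invalid input.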
Source: https://stackoverflow.com/questions/49479913/how-to-check-if-a-string-contain-only-utf-8-characters