I have a dictionary of dictionaries in Python:
d = {"a11y_firesafety.html":{"lang:hi": {"div1": "http://a11y.in/a11y/idea/a11y_firesafety.html:hi"},
You could also use this:
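import fileinput

# Strip the u' and u" prefixes from each line of in.txt and append the result to out.txt.
# This is a plain text substitution, so it also alters any text where a "u" sits directly before a quote.
with open("out.txt", "a") as fout:
    for line in fileinput.input("in.txt"):
        cleaned = line.replace('u"', '"').replace("u'", "'")
        fout.write(cleaned)  # each line already ends with its own newline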
In Python 2, decoded JSON is typically displayed with one of two unicode prefixes, u' or u". This snippet strips both. It may not be necessary, since the prefix does not hinder any logical processing, as a previous commenter mentioned.
Why do you care about the 'u' characters? They're just a visual indicator; unless you're actually using the result of str(temp) in your code, they have no effect. For example:
>>> test = u"abcd"
>>> test == "abcd"
True
If they do matter for some reason, and you don't mind consequences such as not being able to use this code in an international setting, you could pass in a custom object_hook (see the json module documentation) to produce dictionaries with str contents rather than unicode.
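A minimal sketch of such a hook, in case it helps. The names `encode_hook` and `_encode` are made up for illustration, and the sample JSON is not from the question. It assumes UTF-8 is the encoding you want. On Python 2 the result holds plain str objects (no u prefix); on Python 3 the same code yields bytes, which is rarely what you want there.

```python
import json

# Hypothetical helper names; nothing here comes from the original post.
text_type = type(u"")  # unicode on Python 2, str on Python 3


def _encode(value):
    """Encode text to UTF-8 byte strings, recursing into lists."""
    if isinstance(value, text_type):
        return value.encode("utf-8")
    if isinstance(value, list):
        return [_encode(item) for item in value]
    # Numbers, booleans, None and nested dicts (already converted
    # by the hook on an earlier call) pass through unchanged.
    return value


def encode_hook(obj):
    """object_hook: called once per decoded JSON object, innermost first."""
    return {_encode(key): _encode(val) for key, val in obj.items()}


decoded = json.loads('{"lang": {"div1": ["hi", 1]}}', object_hook=encode_hook)
```

Note that object_hook only fires for JSON objects, so a document whose top level is a list or a bare string is not converted by the hook itself.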
There is no "unicode" encoding, since unicode is a separate data type, and I don't really see why unicode would be a problem: you can always convert it to a string with, e.g., foo.encode('utf-8').
However, if you really want to have string objects up front, you should probably create your own decoder class and use it while decoding JSON.
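One way such a decoder class might look, as a rough sketch rather than a definitive implementation. `ByteStringDecoder` and `to_bytes` are hypothetical names, and UTF-8 is an assumed choice of encoding. The class decodes normally, then walks the result and encodes every text value. As with any approach of this kind, Python 3 gives you bytes rather than str.

```python
import json

text_type = type(u"")  # unicode on Python 2, str on Python 3


def to_bytes(value):
    """Recursively encode every text value (including dict keys) as UTF-8."""
    if isinstance(value, text_type):
        return value.encode("utf-8")
    if isinstance(value, list):
        return [to_bytes(item) for item in value]
    if isinstance(value, dict):
        return {to_bytes(k): to_bytes(v) for k, v in value.items()}
    return value


class ByteStringDecoder(json.JSONDecoder):
    """Hypothetical decoder: standard decoding followed by to_bytes()."""

    def decode(self, s, *args, **kwargs):
        result = json.JSONDecoder.decode(self, s, *args, **kwargs)
        return to_bytes(result)


decoded_list = json.loads('[{"div1": "hi"}, "text", 2.5]', cls=ByteStringDecoder)
```

Because the conversion runs on the fully decoded value, this also covers documents whose top level is a list or a string, not just JSON objects.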