Question
I call Google's dictionary API like this:
var json = new WebClient().DownloadString(string.Format(@"http://www.google.com/dictionary/json?callback=dict_api.callbacks.id100&q={0}&sl=en&tl=en", "bar"));
However, the response is one that the following code fails to parse correctly:
json = json.Replace("dict_api.callbacks.id100(", "").Replace(",200,null)", "");
JObject o = JObject.Parse(json);
The parse fails when it encounters this:
"entries":[{"type":"example","terms":[{"type":"text","text":"\x3cem\x3ebars\x3c/em\x3e of sunlight shafting through the broken windows","language":"en"}]}]}
The \x3cem\x3ebars\x stuff kills the parse.
Is there some way to handle this JSONP response with JSON.NET?
The answer by aquinas to another "Parse JSONP" question shows a nice regex, x = Regex.Replace(x, @"^.+?\(|\)$", ""); to handle the JSONP part (it may need tweaking for this case), so the main issue here is how to deal with the hex-encoded characters.
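As a rough sketch of the kind of tweak mentioned above: the quoted regex removes the callback name and the final parenthesis, but this response also carries trailing ",200,null" arguments that would be left behind. The class name JsonpHelper and the method Unwrap are hypothetical names for illustration, and the trailing-argument pattern is an assumption based only on the single response shown in the question.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of JSON.NET.
static class JsonpHelper
{
    // Removes "callbackName(" from the start and ")" from the end,
    // plus an optional ",<digits>,null" argument tail such as ",200,null".
    public static string Unwrap(string jsonp)
    {
        // Everything up to and including the first "(".
        string s = Regex.Replace(jsonp.Trim(), @"^[^(]*\(", "");
        // Optional ",200,null"-style tail, the closing ")", and an optional ";".
        s = Regex.Replace(s, @"(,\d+,null)?\)\s*;?\s*$", "");
        return s;
    }
}
```

The result still contains the \x escapes, so it has to go through the hex-escape fix before being handed to JObject.Parse.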
Answer 1:
Reference: How to decode HTML encoded character embedded in a json string
The JSON specification does not allow hexadecimal escape sequences (\xNN) in strings, only Unicode escape sequences (\uNNNN). That is why the escape sequence is not recognized, and why using, for example, \u0027 instead of \x27 should work. You could blindly replace \x with \u00. This should work fine on typical responses, although in theory it can damage text that contains an escaped backslash followed by x (\\x).
Changing your code to this should fix it:
var json = new WebClient().DownloadString(string.Format(@"http://www.google.com/dictionary/json?callback=dict_api.callbacks.id100&q={0}&sl=en&tl=en", "bar"));
json = json
.Replace("dict_api.callbacks.id100(", "")
.Replace(",200,null)", "")
.Replace("\\x","\\u00");
JObject o = JObject.Parse(json);
Answer 2:
The server is not returning valid JSON: JSON does not support \xAB character escape sequences, only \uABCD escape sequences.
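The "solutions" I have seen run a text replacement on the string first. Here is one of my replies to a similar question for Java. Note the regular expression inputString.replaceAll("\\x(\d{2})", "\\u00$1") at the bottom; adapt it to your language. Be aware that \d{2} matches only decimal digits, so sequences such as \x3c need a hex class like [0-9A-Fa-f]{2} instead.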
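Adapted to C#, the regex-based replacement might look like the sketch below. It is only an illustration: HexEscapeFixer and Fix are hypothetical names, and the pattern goes slightly beyond the quoted Java expression by matching hex digits and by consuming every backslash pair in order, so that an escaped backslash followed by x (\\x3c) is left untouched.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of JSON.NET.
static class HexEscapeFixer
{
    // Matches a backslash plus either "xHH" (captured) or any single character.
    // Consuming escapes pair by pair keeps "\\x3c" from being misread as "\x3c".
    private static readonly Regex Escape =
        new Regex(@"\\(?:x([0-9A-Fa-f]{2})|.)", RegexOptions.Singleline);

    public static string Fix(string json)
    {
        // Rewrite \xHH as \u00HH; leave every other escape exactly as it was.
        return Escape.Replace(json, m =>
            m.Groups[1].Success ? "\\u00" + m.Groups[1].Value : m.Value);
    }
}
```

Calling HexEscapeFixer.Fix(json) in place of the plain .Replace("\\x", "\\u00") should turn \x3cem\x3e into \u003cem\u003e, after which JObject.Parse can be tried on the result.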
Source: https://stackoverflow.com/questions/12362456/how-to-parse-malformed-jsonp-with-hex-encoded-characters-using-json-net