Question
I'm working on an RSS parser for Android (upgrading a parser I found on the internet). As far as I know, a SAX parser detects the encoding automatically from the XML declaration, but when I try to parse a feed that declares windows-1255 encoding, parsing fails and an exception is thrown. I tried a few things:
final InputSource source = new InputSource(feed);
Reader isr = new InputStreamReader(feed);
source.setCharacterStream(isr);
I even tried specifying the encoding explicitly:
source.setEncoding("Windows-1255");
I also tried looking at the locator:
@Override public void setDocumentLocator(Locator locator) { }
It reports the encoding as UTF-16.
Please help me solve this annoying problem! Sorry for the messy code snippets; the code button refuses to work for some reason.
Answer 1:
Chances are the platform itself doesn't know about the "windows-1255" encoding. After all, it's a Windows-based encoding - I wouldn't want to rely on it being available on any other platforms, particularly mobile ones where things are generally cut down to the "must-have" options.
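One way to test this hypothesis is to ask the runtime directly whether it knows the charset. The following is a minimal sketch using the standard `java.nio.charset.Charset` API; the class and method names (`CharsetCheck`, `isAvailable`) are made up for illustration, and the result will depend on the platform and runtime it runs on.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;

// Hypothetical helper, not part of the original parser.
class CharsetCheck {

    // Returns true if the running platform can decode the named charset.
    static boolean isAvailable(String name) {
        try {
            return Charset.isSupported(name);
        } catch (IllegalCharsetNameException e) {
            // The name itself is syntactically invalid, so it cannot be supported.
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println("windows-1255 supported: " + isAvailable("windows-1255"));
        // Charset.availableCharsets() lists every charset the runtime offers.
        System.out.println("available charsets: " + Charset.availableCharsets().keySet());
    }
}
```

If the check returns false on the target device, no parser configuration will help and the bytes would have to be decoded some other way.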
Answer 2:
You need to set the encoding on the InputStreamReader:
Reader isr = new InputStreamReader(feed, "windows-1255");
final InputSource source = new InputSource(isr);
According to the Javadoc, the logic for reading from an InputSource goes roughly like this:
- Is there a character stream? If so, use it. (This is what happens when you supply a Reader such as InputStreamReader; the encoding declared in the XML is then disregarded.)
Otherwise:
- No character stream? Use the byte stream (InputStream).
- Is there an encoding set on the InputSource? If so, use it.
- No encoding set? Try to detect the encoding from the XML file itself.
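The character-stream path above can be sketched as a small self-contained program. This is only an illustration under a few assumptions: it uses the desktop JAXP API (`SAXParserFactory`), the class names `ExplicitCharsetParse` and `TextCollector` are hypothetical, the sample document is invented, and it presumes the runtime supports windows-1255.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical example class, not taken from the original parser.
class ExplicitCharsetParse {

    // Collects all character data reported by the parser.
    static class TextCollector extends DefaultHandler {
        final StringBuilder text = new StringBuilder();

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }
    }

    // Decodes the bytes ourselves and hands the parser a character stream,
    // so the parser never has to resolve the charset name on its own.
    static String parseText(InputStream feed, String charsetName) throws Exception {
        Reader reader = new InputStreamReader(feed, charsetName);
        InputSource source = new InputSource(reader);
        TextCollector handler = new TextCollector();
        SAXParserFactory.newInstance().newSAXParser().parse(source, handler);
        return handler.text.toString();
    }

    public static void main(String[] args) throws Exception {
        // Invented sample feed fragment containing Hebrew text.
        String xml = "<?xml version=\"1.0\" encoding=\"windows-1255\"?>"
                + "<title>\u05e9\u05dc\u05d5\u05dd</title>";
        byte[] bytes = xml.getBytes("windows-1255");
        System.out.println(parseText(new ByteArrayInputStream(bytes), "windows-1255"));
    }
}
```

Because the Reader already yields decoded characters, the charset passed to `InputStreamReader` has to match the actual bytes of the feed; a mismatch would produce garbled text rather than a parser error.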
Source: https://stackoverflow.com/questions/9931024/sax-parser-doesnt-recognize-windows-1255-encoding