I've created my own DefaultHandler to parse RSS feeds, and for most feeds it's working fine. However, for ESPN, it is cutting off part of the article URL due to the way ESP
As you can see, it's cutting off everything in the URL from the ampersand escape code (&amp;) onward.
From the documentation of the characters() method:
The Parser will call this method to report each chunk of character data. SAX parsers may return all contiguous character data in a single chunk, or they may split it into several chunks; however, all of the characters in any single event must come from the same external entity so that the Locator provides useful information.
When I write SAX parsers, I use a StringBuilder to append everything passed to characters():
    public void characters(char[] ch, int start, int length) {
        // Append this chunk; the parser may deliver the text in several calls.
        if (buf != null) {
            buf.append(ch, start, length);
        }
    }
Then in endElement(), I take the contents of the StringBuilder and do something with it. That way, if the parser calls characters() several times, I don't miss anything.
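For illustration, here is a minimal sketch of the whole pattern. It assumes the URL you want lives in a <link> element, and the LinkHandler class and parseLinks helper are names I made up for the example, not part of your code or of the SAX API. The buffer is created in startElement(), filled in characters(), and read in endElement():

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical handler: collects the text of every <link> element.
class LinkHandler extends DefaultHandler {
    private final List<String> links = new ArrayList<>();
    private StringBuilder buf; // non-null only while inside a <link> element

    @Override
    public void startElement(String uri, String localName, String qName, Attributes atts) {
        if ("link".equals(qName)) {
            buf = new StringBuilder(); // start collecting text for this element
        }
    }

    @Override
    public void characters(char[] ch, int start, int length) {
        // May be called more than once per element, e.g. around an entity such as &amp;
        if (buf != null) {
            buf.append(ch, start, length);
        }
    }

    @Override
    public void endElement(String uri, String localName, String qName) {
        if ("link".equals(qName) && buf != null) {
            links.add(buf.toString().trim()); // the complete text is available only here
            buf = null;
        }
    }

    // Hypothetical convenience method: parse an XML string and return the collected links.
    static List<String> parseLinks(String xml) throws Exception {
        LinkHandler handler = new LinkHandler();
        SAXParserFactory.newInstance().newSAXParser()
                .parse(new InputSource(new StringReader(xml)), handler);
        return handler.links;
    }
}
```

With this approach, a link written in the feed as http://example.com/story?id=1&amp;campaign=rss should come back as the full URL with a literal & in it, regardless of how many chunks the parser splits it into.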