Question
I am trying to use @XmlAnyElement with a DomHandler to capture the unparsed text within a particular field, as in this example from Blaise Doughan. But when I parse multiple customers, the contents of the bio fields from all previous records keep being sent to my DomHandler!
Here is the example document I am trying to parse:
<?xml version="1.0" encoding="UTF-8"?>
<customers>
    <customer>
        <name>Jane Doe</name>
        <bio>
            <html>Jane's bio</html>
        </bio>
    </customer>
    <customer>
        <name>John Doe</name>
        <bio>
            <html>John's bio</html>
        </bio>
    </customer>
</customers>
But the output is:
Name: Jane Doe
Bio: <html>Jane's bio</html>
Name: John Doe
Bio: <html>Jane's bio</html>
BioHandler (unchanged from previous example)
package blog.domhandler;

import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.bind.ValidationEventHandler;
import javax.xml.bind.annotation.DomHandler;
import javax.xml.transform.Source;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

public class BioHandler implements DomHandler<String, StreamResult> {

    private static final String BIO_START_TAG = "<bio>";
    private static final String BIO_END_TAG = "</bio>";

    private StringWriter xmlWriter = new StringWriter();

    public StreamResult createUnmarshaller(ValidationEventHandler errorHandler) {
        return new StreamResult(xmlWriter);
    }

    public String getElement(StreamResult rt) {
        String xml = rt.getWriter().toString();
        int beginIndex = xml.indexOf(BIO_START_TAG) + BIO_START_TAG.length();
        int endIndex = xml.indexOf(BIO_END_TAG);
        return xml.substring(beginIndex, endIndex);
    }

    public Source marshal(String n, ValidationEventHandler errorHandler) {
        try {
            String xml = BIO_START_TAG + n.trim() + BIO_END_TAG;
            StringReader xmlReader = new StringReader(xml);
            return new StreamSource(xmlReader);
        } catch(Exception e) {
            throw new RuntimeException(e);
        }
    }
}
Customer (unchanged from previous example)
package blog.domhandler;

import javax.xml.bind.annotation.XmlAnyElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement
@XmlType(propOrder={"name", "bio"})
public class Customer {

    private String name;
    private String bio;

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }

    @XmlAnyElement(BioHandler.class)
    public String getBio() {
        return bio;
    }

    public void setBio(String bio) {
        this.bio = bio;
    }
}
Customers
package blog.domhandler;

import java.util.List;
import javax.xml.bind.annotation.XmlAnyElement;
import javax.xml.bind.annotation.XmlRootElement;
import javax.xml.bind.annotation.XmlType;

@XmlRootElement
public class Customers {

    private List<Customer> customers;

    public List<Customer> getCustomer() {
        return customers;
    }

    public void setCustomer(List<Customer> c) {
        this.customers = c;
    }
}
Demo (driver)
package blog.domhandler;

import java.io.File;
import javax.xml.bind.JAXBContext;
import javax.xml.bind.Marshaller;
import javax.xml.bind.Unmarshaller;

public class Demo {

    public static void main(String[] args) throws Exception {
        JAXBContext jc = JAXBContext.newInstance(Customers.class);
        Unmarshaller unmarshaller = jc.createUnmarshaller();
        Customers customers = (Customers) unmarshaller.unmarshal(new File("src/blog/domhandler/input.xml"));
        for (Customer customer : customers.getCustomer()) {
            System.out.println("Name: " + customer.getName());
            System.out.println("Bio: " + customer.getBio());
        }
    }
}
When I place a breakpoint in BioHandler.getElement(), I see that the first time it is called, String xml takes the value
<?xml version="1.0" encoding="UTF-8"?><bio><html>Jane's bio</html>
</bio>
while the second time it is called String xml takes the value
<?xml version="1.0" encoding="UTF-8"?><bio><html>Jane's bio</html>
</bio><?xml version="1.0" encoding="UTF-8"?><bio><html>John's bio</html>
</bio>
Is there some way to indicate to the parser that this content should be discarded after each call to BioHandler.getElement()?
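Answer 1:
It turns out my question was answered by the first comment on the blog post this example is taken from. The handler's StringWriter is a field, so it keeps accumulating output across calls, and getElement() then picks out the first <bio>...</bio> pair in the buffer, which is always Jane's. Clearing the buffer in BioHandler.createUnmarshaller() fixes it: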
public StreamResult createUnmarshaller(ValidationEventHandler errorHandler) {
    xmlWriter.getBuffer().setLength(0);
    return new StreamResult(xmlWriter);
}
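To see why the reset matters without needing a JAXB runtime, here is a minimal sketch that reproduces the behaviour with a plain StringWriter. The class WriterResetDemo and its capture() method are hypothetical stand-ins, not part of JAXB or the original example. The sketch assumes, as the debugger output in the question suggests, that the same handler instance (and therefore the same writer) is reused for every record.

```java
import java.io.StringWriter;

public class WriterResetDemo {

    private static final String BIO_START_TAG = "<bio>";
    private static final String BIO_END_TAG = "</bio>";

    // Shared buffer, playing the role of the xmlWriter field in BioHandler.
    private final StringWriter xmlWriter = new StringWriter();

    // Same extraction logic as BioHandler.getElement():
    // indexOf() returns the FIRST match, so stale content wins.
    private String extract() {
        String xml = xmlWriter.toString();
        int beginIndex = xml.indexOf(BIO_START_TAG) + BIO_START_TAG.length();
        int endIndex = xml.indexOf(BIO_END_TAG);
        return xml.substring(beginIndex, endIndex);
    }

    // Hypothetical helper simulating one record being handled.
    // When 'reset' is true, the buffer is emptied first (the fix).
    public String capture(String fragment, boolean reset) {
        if (reset) {
            xmlWriter.getBuffer().setLength(0);
        }
        xmlWriter.write(fragment);
        return extract();
    }

    public static void main(String[] args) {
        WriterResetDemo buggy = new WriterResetDemo();
        System.out.println(buggy.capture("<bio>Jane's bio</bio>", false)); // Jane's bio
        System.out.println(buggy.capture("<bio>John's bio</bio>", false)); // Jane's bio (stale)

        WriterResetDemo fixed = new WriterResetDemo();
        System.out.println(fixed.capture("<bio>Jane's bio</bio>", true));  // Jane's bio
        System.out.println(fixed.capture("<bio>John's bio</bio>", true));  // John's bio
    }
}
```

An alternative with the same effect would be to assign a fresh StringWriter to the field on each createUnmarshaller() call instead of truncating the existing one. Either way, the handler still holds per-call state in a field, so it is probably not safe to share across threads.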
Source: https://stackoverflow.com/questions/23550197/domhandler-to-capture-text-for-multiple-records