Can JAXB parse large XML files in chunks?

终归单人心 2020-11-29 03:26

I need to parse potentially large XML files whose schema is already provided to me in several XSD files, so XML binding is highly favored. I'd like to know if I can

4 answers
  •  时光说笑
    2020-11-29 03:58

    Yves Amsellem's answer is pretty good, but it only works if all elements are of exactly the same type. Otherwise the unmarshal call will throw an exception, and because the reader has already consumed those bytes, you cannot recover and continue. Instead, we should follow Skaffman's advice and look at the streaming sample that ships with the JAXB jar.

    To explain how it works:

    1. Create a JAXB unmarshaller.
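    2. Add a listener to the unmarshaller to intercept the container elements. Before a container is populated, the listener replaces its list field with an ArrayList subclass whose add() passes each element to a callback and then discards it, so elements are not kept in memory after being unmarshalled.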
    3. Create a SAX parser. This is where the streaming happens.
    4. Use the unmarshaller to generate a handler for the SAX parser.
    5. Stream!
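
Step 3 can be seen in isolation with a minimal sketch that uses only the JDK's SAX API and no JAXB. In the full solution the content handler is the one returned by unmarshaller.getUnmarshallerHandler(); here a plain DefaultHandler stands in for it. The class name SaxStreamingSketch, the method orderIds, and the id attribute are hypothetical and chosen for illustration only.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical sketch: the parser pushes events to the handler as it reads,
// so the document is never materialized as a whole tree.
class SaxStreamingSketch {

    // Collects the (assumed) id attribute of every purchaseOrder element.
    static List<String> orderIds(String xml) {
        final List<String> ids = new ArrayList<>();
        try {
            SAXParserFactory factory = SAXParserFactory.newInstance();
            factory.setNamespaceAware(true);
            XMLReader reader = factory.newSAXParser().getXMLReader();

            // In the JAXB solution this would be unmarshaller.getUnmarshallerHandler().
            reader.setContentHandler(new DefaultHandler() {
                @Override
                public void startElement(String uri, String localName,
                                         String qName, Attributes atts) {
                    if ("purchaseOrder".equals(localName)) {
                        ids.add(atts.getValue("id"));
                    }
                }
            });

            reader.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
        return ids;
    }
}
```

The handler is called once per element while the input is still being read, which is why memory use depends on what the handler keeps rather than on the size of the file.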

    I modified the solution to be generic*. However, it required some reflection. If this is not OK, please look at the code samples in the JAXB jars.

    ArrayListAddInterceptor.java

    import java.lang.reflect.Field;
    import java.util.ArrayList;
    
    public class ArrayListAddInterceptor<T> extends ArrayList<T> {
        private static final long serialVersionUID = 1L;

        private final AddInterceptor<T> interceptor;

        public ArrayListAddInterceptor(AddInterceptor<T> interceptor) {
            this.interceptor = interceptor;
        }

        // Hand the element to the callback and drop it, so the list never grows.
        @Override
        public boolean add(T t) {
            interceptor.intercept(t);
            return false;
        }

        public static interface AddInterceptor<T> {
            public void intercept(T t);
        }

        // Replace the named list field of o with an intercepting list (uses reflection).
        public static <T> void apply(AddInterceptor<T> interceptor, Object o, String property) {
            try {
                Field field = o.getClass().getDeclaredField(property);
                field.setAccessible(true);
                field.set(o, new ArrayListAddInterceptor<T>(interceptor));
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    
    }
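
To see the effect of the interception without any JAXB setup, here is a small self-contained sketch of the same trick. OrderBatch is a hypothetical stand-in for a JAXB-generated container class, and ForwardingList and InterceptorDemo are illustrative names that do not appear in the JAXB sample. The sketch assumes the container exposes its list through a private field, as JAXB-generated classes typically do.

```java
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a generated container class.
class OrderBatch {
    private List<String> order = new ArrayList<>();
    List<String> getOrder() { return order; }
}

// Same idea as ArrayListAddInterceptor: forward each element, retain nothing.
class ForwardingList<T> extends ArrayList<T> {
    interface Callback<T> { void accept(T item); }

    private final Callback<T> callback;

    ForwardingList(Callback<T> callback) { this.callback = callback; }

    @Override
    public boolean add(T item) {
        callback.accept(item); // process the element immediately
        return false;          // the element is not stored
    }

    // Swap the named list field of target for a forwarding list via reflection.
    static <T> void apply(Callback<T> callback, Object target, String fieldName) {
        try {
            Field field = target.getClass().getDeclaredField(fieldName);
            field.setAccessible(true);
            field.set(target, new ForwardingList<T>(callback));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }
}

class InterceptorDemo {
    // Returns what the callback saw; the batch's own list stays empty.
    static List<String> run(OrderBatch batch) {
        List<String> processed = new ArrayList<>();
        ForwardingList.<String>apply(processed::add, batch, "order");
        batch.getOrder().add("po-1");
        batch.getOrder().add("po-2");
        return processed;
    }
}
```

After run returns, the callback has received both items while batch.getOrder() is still empty, which is the property that keeps memory flat during unmarshalling. One caveat of this design: add returns false, which deviates from the usual Collection.add contract, so the list should only be used where the caller ignores the return value.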
    

    Main.java

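    import java.io.File;
    import java.io.FileInputStream;
    import java.util.List;
    import javax.xml.bind.JAXBContext;
    import javax.xml.bind.Unmarshaller;
    import javax.xml.parsers.SAXParserFactory;
    import org.xml.sax.InputSource;
    import org.xml.sax.XMLReader;

    // PurchaseOrders and PurchaseOrder are assumed to be the classes generated from
    // primer.xsd into the "primer" package; adjust the names to your generated code.
    import primer.*;

    public class Main {
        public void parsePurchaseOrders(
                final ArrayListAddInterceptor.AddInterceptor<PurchaseOrder> interceptor,
                List<File> files) {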
            try {
                // create JAXBContext for the primer.xsd
                JAXBContext context = JAXBContext.newInstance("primer");
    
                Unmarshaller unmarshaller = context.createUnmarshaller();
    
                // install the callback on all PurchaseOrders instances
                unmarshaller.setListener(new Unmarshaller.Listener() {
                    public void beforeUnmarshal(Object target, Object parent) {
                        if (target instanceof PurchaseOrders) {
                            ArrayListAddInterceptor.apply(interceptor, target, "purchaseOrder");
                        }
                    }
                });
    
                // create a new XML parser
                SAXParserFactory factory = SAXParserFactory.newInstance();
                factory.setNamespaceAware(true);
                XMLReader reader = factory.newSAXParser().getXMLReader();
                reader.setContentHandler(unmarshaller.getUnmarshallerHandler());
    
                for (File file : files) {
                    // close each stream once its file has been parsed
                    try (FileInputStream in = new FileInputStream(file)) {
                        reader.parse(new InputSource(in));
                    }
                }
            } catch (Exception e) {
                throw new RuntimeException(e);
            }
        }
    }
    

    *This code has not been tested and is for illustrative purposes only.
