I am trying to merge many XML files into one. I have successfully done that with DOM, but this solution is limited to a few files. When I run it on more than 1000 files I am getting an out-of-memory error.
DOM does consume a lot of memory. You have, imho, the following alternatives.
The best one is SAX. With SAX, only a very small amount of memory is used, because essentially a single element is travelling from input to output at any given time, so the memory footprint is extremely low. However, SAX is not so simple to use: compared to DOM it is a bit counterintuitive.
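As a rough illustration of the SAX route, here is a minimal handler that re-serializes elements straight to a writer as the events arrive. The class name, file names and the `merged` wrapper element are my own illustrative choices, and a real version would also escape attribute values and deal with namespaces:

```java
// Minimal SAX sketch: re-serialize each element as it streams past.
// NOTE: attribute values are written unescaped here for brevity.
import org.xml.sax.Attributes;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import java.io.*;

public class SaxCopy extends DefaultHandler {
    private final Writer out;
    public SaxCopy(Writer out) { this.out = out; }

    @Override public void startElement(String uri, String local, String qName,
                                       Attributes atts) throws SAXException {
        try {
            out.write('<' + qName);
            for (int i = 0; i < atts.getLength(); i++)
                out.write(' ' + atts.getQName(i) + "=\"" + atts.getValue(i) + '"');
            out.write('>');
        } catch (IOException e) { throw new SAXException(e); }
    }

    @Override public void endElement(String uri, String local, String qName)
            throws SAXException {
        try { out.write("</" + qName + '>'); }
        catch (IOException e) { throw new SAXException(e); }
    }

    @Override public void characters(char[] ch, int start, int len)
            throws SAXException {
        try {
            // escape the characters that must not appear raw in text content
            out.write(new String(ch, start, len)
                    .replace("&", "&amp;").replace("<", "&lt;"));
        } catch (IOException e) { throw new SAXException(e); }
    }

    public static void main(String[] args) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        try (Writer out = new FileWriter("sax-merged.xml")) {
            out.write("<?xml version=\"1.0\"?><merged>");
            for (String name : args)                 // one file streamed at a time
                parser.parse(new File(name), new SaxCopy(out));
            out.write("</merged>");
        }
    }
}
```

Only the current event is held in memory, which is why this scales to thousands of files.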
Or try StAX. I have not tried it myself, but it is a kind of SAX on steroids, easier to implement and use: instead of just receiving SAX events you don't control, you actually "ask the source" to stream you the elements you want. It sits in the middle between DOM and SAX, with a memory footprint similar to SAX but a friendlier paradigm.
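A StAX merge can be sketched by copying events from each input reader to one output writer, skipping each input's own document events. The `merged` root element and the file-name handling are assumptions for the example:

```java
// StAX sketch: pull events from each input and push them to one output.
import javax.xml.stream.*;
import javax.xml.stream.events.XMLEvent;
import java.io.*;

public class StaxMerge {
    public static void main(String[] args) throws Exception {
        XMLInputFactory inFactory = XMLInputFactory.newInstance();
        XMLOutputFactory outFactory = XMLOutputFactory.newInstance();
        XMLEventFactory events = XMLEventFactory.newInstance();
        try (Writer out = new FileWriter("merged.xml")) {
            XMLEventWriter writer = outFactory.createXMLEventWriter(out);
            writer.add(events.createStartDocument());
            writer.add(events.createStartElement("", "", "merged"));
            for (String name : args) {               // one small file per iteration
                XMLEventReader reader =
                        inFactory.createXMLEventReader(new FileReader(name));
                while (reader.hasNext()) {
                    XMLEvent e = reader.nextEvent();
                    // drop each input's own start/end-document events,
                    // copy everything else verbatim
                    if (!e.isStartDocument() && !e.isEndDocument())
                        writer.add(e);
                }
                reader.close();
            }
            writer.add(events.createEndElement("", "", "merged"));
            writer.add(events.createEndDocument());
            writer.close();
        }
    }
}
```

Because events are copied verbatim, namespace declarations on the copied elements survive the round trip.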
SAX, StAX and DOM all matter if you want to correctly preserve, declare etc... namespaces and other XML oddities.
However, if you just need a quick and dirty way, which will probably be namespace-compliant as well, use plain old strings and writers.

Start by outputting the declaration and the root element of your "big" document to the FileWriter. Then load, using DOM if you like, each single file. Select the elements you want to end up in the "big" file, serialize them back to a string, and send them to the writer. The writer will flush to disk without using an enormous amount of memory, and DOM will load only one document per iteration. Unless you also have very big files on the input side, or plan to run it on a cellphone, you should not have a lot of memory problems. If DOM serializes it correctly, it should preserve namespace declarations and the like, and the code will be just a bunch of lines more than the one you posted.
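The steps above could look something like this, using a JAXP identity `Transformer` to serialize each child element back to a string (the `big` root element and file names are illustrative; here every child of each input's root is copied):

```java
// Quick-and-dirty sketch: one DOM per file, serialized element by element
// into a single FileWriter.
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import java.io.*;

public class WriterMerge {
    public static void main(String[] args) throws Exception {
        DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Transformer serializer = TransformerFactory.newInstance().newTransformer();
        serializer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        try (Writer out = new FileWriter("big.xml")) {
            out.write("<?xml version=\"1.0\"?><big>");
            for (String name : args) {          // only one DOM in memory at a time
                Document doc = builder.parse(new File(name));
                NodeList children = doc.getDocumentElement().getChildNodes();
                for (int i = 0; i < children.getLength(); i++) {
                    if (children.item(i).getNodeType() != Node.ELEMENT_NODE)
                        continue;               // skip whitespace text nodes
                    StringWriter buf = new StringWriter();
                    serializer.transform(new DOMSource(children.item(i)),
                                         new StreamResult(buf));
                    out.write(buf.toString());  // goes to disk, not held in memory
                }
            }
            out.write("</big>");
        }
    }
}
```

The identity transform re-emits each element with its namespace declarations, which is why this approach usually stays namespace-compliant.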