Splitting and merging PDF files using PDFBox produces a large file
Question: I have a large PDF print file that contains 5544 pages and is about 36 MB in size. The file was created by MS Word 2010 and contains only text and a logo on each letter/document. I split it into 5544 single-page files and merge them back into 2770 letters, based on keywords. Each letter is approximately 140-145 KB. When I merge all the letters into a new PDF print file, still containing 5544 pages, the file size has grown to 396 MB. All text extracting, splitting and merging is performed with calls to