PDFBox text extraction not working properly
Question

    PDFTextStripper stripper = new PDFTextStripper();
    PDDocument document = PDDocument.load(inputStream);
    String text = stripper.getText(document);

Extracted text: http://pastebin.com/BXFfMy0z
Problem PDF: http://www.iwb.ch/media/Unternehmen/Dokumente/inserat_leiter_pm.pdf

What can I do to extract the correct text from this PDF file?

Answer 1:

In addition to @karthik27's answer: Adobe Reader is fairly good at text extraction and can therefore generally be used as an indicator of whether text extraction