Parsing PDF files (especially with tables) with PDFBox
Question

I need to parse a PDF file that contains tabular data. I'm using PDFBox to extract the file's text so that I can parse the resulting String later. The problem is that text extraction doesn't work as I expected for tabular data. For example, I have a file containing a table like this (7 columns: the first two always have data, only one Complexity column has data, and only one Financing column has data):

+----------------------------------------------------------------+
| AIH | Value | Complexity |