Detect Bold, Italic and Strike Through text using PDFBox with VB.NET
Question: Is there a way to preserve text formatting when extracting text from a PDF with PDFBox? I have a program that parses a PDF document for information. When a new version of the PDF is released, the authors use bold or italic text to indicate new information and strikethrough or underlined text to indicate omitted text. The base stripper class in PDFBox returns all the text, but the formatting is removed, so I have no way of telling whether the text is new or omitted. I'm currently using the project