Check if a PDF file is a scanned one

傲寒 2020-12-09 20:08

What is the best way to programmatically check whether a PDF file consists entirely of scanned pages? I have iText and PDFBox at my disposal. I can check if a PDF file contains text or

6 answers
  •  难免孤独
    2020-12-09 21:03

    IMHO you cannot decide that for sure. But you can try a few heuristics: look for extractable text; run OCR on the PDF and decide based on the amount of recognized text; or look for typical scanning artifacts such as fade-outs or paper/book margins.
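The first heuristic (looking for extractable text) could be sketched roughly as follows. This is only a minimal illustration, not a definitive test: the class name `ScanHeuristic`, its methods, and the threshold of 20 characters per page are all assumptions made up for this example. It expects the per-page text to have been extracted beforehand, for instance with PDFBox's `PDFTextStripper` restricted to one page at a time via `setStartPage` and `setEndPage`.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: decides from already-extracted page text whether a PDF
// looks like it contains only scanned images. Names and threshold are assumptions.
final class ScanHeuristic {
    // Assumed cutoff: a page with fewer visible characters is treated as "no real text".
    static final int MIN_CHARS_PER_PAGE = 20;

    /** Counts the non-whitespace characters in the text extracted from one page. */
    static int visibleChars(String pageText) {
        int count = 0;
        for (int i = 0; i < pageText.length(); i++) {
            if (!Character.isWhitespace(pageText.charAt(i))) {
                count++;
            }
        }
        return count;
    }

    /** Returns the fraction of pages whose extracted text falls below the cutoff. */
    static double textlessPageRatio(List<String> pageTexts) {
        if (pageTexts.isEmpty()) {
            return 0.0;
        }
        int textless = 0;
        for (String text : pageTexts) {
            if (visibleChars(text) < MIN_CHARS_PER_PAGE) {
                textless++;
            }
        }
        return (double) textless / pageTexts.size();
    }

    /** Guesses "fully scanned" only when there is at least one page and none has real text. */
    static boolean looksFullyScanned(List<String> pageTexts) {
        return !pageTexts.isEmpty() && textlessPageRatio(pageTexts) == 1.0;
    }

    public static void main(String[] args) {
        // Stand-in data; in practice each entry would be the text extracted from one page.
        List<String> pages = Arrays.asList("", "  \n", "");
        System.out.println("Textless page ratio: " + textlessPageRatio(pages));
        System.out.println("Looks fully scanned: " + looksFullyScanned(pages));
    }
}
```

Note that this check alone gives a false negative for scans that carry an invisible OCR text layer, and a false positive for PDFs whose text was converted to vector outlines, which is why the answer suggests combining it with other signals.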
