What is the best way to programmatically check whether a PDF file consists entirely of scanned pages?
I have iText and PDFBox at my disposal. I can check if a PDF file contains text or
IMHO you cannot decide that for sure, but there are a few heuristics you can try: check whether the PDF contains any extractable text; run OCR on the pages and decide based on how much text is recognized; or look for typical scanning artifacts such as fade-outs or paper/book margins.
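
As a rough sketch of the first heuristic (not a definitive test), the decision step could look like the code below. It assumes you have already extracted the text of each page into a list of strings, for example with PDFBox's `PDFTextStripper` called once per page via `setStartPage`/`setEndPage`. The class name `ScanHeuristic`, its method names, and the `minChars` threshold are all hypothetical and only there for illustration. Note that a scanned PDF that already carries an OCR text layer will still look like a "text" PDF to this check.

```java
import java.util.List;

// Hypothetical helper: classifies a PDF from its per-page extracted text.
// The per-page strings are assumed to come from a text extractor such as
// PDFBox's PDFTextStripper; the extraction itself is not shown here.
final class ScanHeuristic {

    // Fraction of pages whose extracted text has fewer than minChars
    // non-whitespace characters. Returns 0.0 for an empty page list.
    static double textlessPageRatio(List<String> pageTexts, int minChars) {
        if (pageTexts.isEmpty()) {
            return 0.0;
        }
        int textless = 0;
        for (String text : pageTexts) {
            long visible = text.chars()
                    .filter(c -> !Character.isWhitespace(c))
                    .count();
            if (visible < minChars) {
                textless++;
            }
        }
        return (double) textless / pageTexts.size();
    }

    // True only if there is at least one page and every page is (nearly) textless.
    static boolean looksFullyScanned(List<String> pageTexts, int minChars) {
        return !pageTexts.isEmpty() && textlessPageRatio(pageTexts, minChars) == 1.0;
    }
}
```

Calling `ScanHeuristic.looksFullyScanned(pageTexts, 10)` would then flag documents where no page yields at least 10 visible characters. The threshold is a guess you would need to tune on your own files, and a positive result is best confirmed with one of the other checks (OCR, or looking for full-page images).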