How to know if a PDF contains only images or has been OCR scanned for searching?

借酒劲吻你 2020-12-08 10:35

I have a bunch of PDF files that came from scanned documents. The files contain a mix of images and text. Some were scanned as images with no OCR, so each PDF page is one

7 answers
  •  暖寄归人
    2020-12-08 10:49

    Use dtSearch to build an index of all the PDF files, then open the log file from the indexing run and check the list of PDF files that were not indexed. Those are the files with no searchable text.
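
    If an indexing tool is not at hand, the same idea (a file with no extractable text is image-only) can be roughly sketched in a script. This is not the dtSearch approach described above but a swapped-in alternative, and it assumes the third-party pypdf package is installed. The names classify_pages and classify_pdf and the min_chars threshold of 20 are hypothetical choices made for illustration only.

```python
from typing import List


def classify_pages(page_texts: List[str], min_chars: int = 20) -> str:
    """Classify a PDF from the text extracted from each of its pages.

    Returns "image-only" if no page has a usable text layer,
    "searchable" if every page does, and "mixed" otherwise.
    min_chars is an arbitrary threshold assumed for this sketch.
    """
    has_text = [len((t or "").strip()) >= min_chars for t in page_texts]
    if not any(has_text):  # also covers a PDF with zero pages
        return "image-only"
    if all(has_text):
        return "searchable"
    return "mixed"


def classify_pdf(path: str, min_chars: int = 20) -> str:
    """Extract text per page with pypdf and classify the file."""
    # Imported here so classify_pages stays usable without pypdf.
    from pypdf import PdfReader

    reader = PdfReader(path)
    texts = [page.extract_text() or "" for page in reader.pages]
    return classify_pages(texts, min_chars)
```

    Calling classify_pdf on each file in a folder would give a per-file label. A page that holds only a few stray characters, such as a page number, is treated as having no text layer, so the threshold may need tuning for a given set of scans.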
