Copy+pasting text from PDF results in garbage

无人及你 2021-02-20 00:37

I am writing a Master's thesis on an NLP system. One of its components is an extractor.

It extracts plain text from PDF files. There are a few PDF files that cannot be

7 Answers
  •  星月不相逢
    2021-02-20 01:19

    When the PDF is opened as a Gmail attachment in Chrome (the built-in PDF viewer), copying does yield normal, readable characters!

    It worked for me when I had this problem, and for others as well. I think the Chrome PDF viewer uses Google Drive OCR automatically... It's like magic!
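
For an automated extractor, one way to act on the OCR idea above might be to flag text that looks like garbage and route only those files to an OCR step. The following is a minimal sketch in Python, not something from this thread: the function names `garbage_ratio` and `looks_garbled` and the `0.2` threshold are hypothetical, and the heuristic only counts private-use, unassigned, control, and replacement characters. It will not catch text that is wrong but still made of ordinary letters.

```python
import unicodedata

# Unicode general categories treated as extraction debris (an assumption):
# Co = private use, Cn = unassigned, Cc = control, Cs = surrogate.
SUSPECT_CATEGORIES = {"Co", "Cn", "Cc", "Cs"}


def garbage_ratio(text: str) -> float:
    """Return the fraction of non-whitespace characters that look like debris."""
    chars = [c for c in text if not c.isspace()]
    if not chars:
        # Nothing was extracted at all, so treat the result as fully suspect.
        return 1.0
    bad = sum(
        1
        for c in chars
        if c == "\ufffd" or unicodedata.category(c) in SUSPECT_CATEGORIES
    )
    return bad / len(chars)


def looks_garbled(text: str, threshold: float = 0.2) -> bool:
    """Heuristic check: True if the share of suspect characters exceeds threshold."""
    return garbage_ratio(text) > threshold
```

A file for which `looks_garbled(extracted_text)` returns `True` would then be a candidate for OCR instead of direct text extraction. The threshold would likely need tuning on the actual corpus.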
