Indexing PDF from badly authored LaTeX source

Posted by 橙三吉。 on 2019-12-13 18:49:32

Question


I've noticed this issue with some PDF files that originate from LaTeX source (I assume so from the page layout, design, and fonts used).

Today I was reading such an article and couldn't copy meaningful text or run a text search, and of course the document can't be indexed. Here is one random example: http://www.vincent-net.com/luc/papers/00informatica_granul.pdf

Is there some procedure that makes this kind of document accessible? The only thing that comes to mind is to rasterize the document, run OCR, and save the result, but that feels just dumb.
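
To make the symptom concrete, here is a minimal Python sketch of a check that could decide whether a document needs the OCR fallback at all. It assumes the text has already been pulled out of the PDF by whatever extraction tool is at hand, and that the document is in English. The function name `looks_like_text` and the 0.6 threshold are hypothetical choices for illustration, not part of any library.

```python
import re

# A token counts as "word-like" if it is alphabetic (optionally hyphenated or
# with an apostrophe), may end in common punctuation, and contains a vowel.
WORD = re.compile(r"^[A-Za-z]+(?:[-'][A-Za-z]+)*[.,;:!?)]*$")
VOWELS = set("aeiouyAEIOUY")


def looks_like_text(extracted: str, min_ratio: float = 0.6) -> bool:
    """Crude heuristic: True if most whitespace-separated tokens look like English words.

    `min_ratio` is an arbitrary illustrative threshold, not a tuned value.
    """
    tokens = extracted.split()
    if not tokens:
        # Nothing extractable at all: treat as unusable.
        return False
    wordlike = sum(
        1 for t in tokens if WORD.match(t) and any(c in VOWELS for c in t)
    )
    return wordlike / len(tokens) >= min_ratio
```

A page whose extracted text fails this check would be a candidate for the rasterize-and-OCR route; a page that passes can be indexed directly.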

Source: https://stackoverflow.com/questions/14474405/indexing-pdf-from-badly-authored-latex-source
