Question
I've noticed this issue with some PDF files that originate from LaTeX source (I assume so from the page layout, design, and fonts used).
Today I was reading one such article and I couldn't copy meaningful text or search the text, and of course the document can't be indexed. Here is one random example: http://www.vincent-net.com/luc/papers/00informatica_granul.pdf
Is there some procedure to make this kind of document accessible? The only thing that comes to mind is to rasterize the document, run OCR on it, and save the result, but that feels just dumb.
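
For concreteness, the rasterize-then-OCR fallback could be sketched roughly as follows. This is only a minimal sketch. It assumes the Poppler `pdftoppm` and Tesseract `tesseract` command-line tools are installed and on the PATH. The function names (`rasterize_cmd`, `ocr_cmd`, `ocr_pdf`), the `page` prefix, and the 300 dpi default are hypothetical choices for illustration.

```python
import subprocess
from pathlib import Path


def rasterize_cmd(pdf_path, prefix, dpi=300):
    """Build the pdftoppm command that renders each page to <prefix>-<n>.png."""
    return ["pdftoppm", "-r", str(dpi), "-png", str(pdf_path), str(prefix)]


def ocr_cmd(image_path, out_base, lang="eng"):
    """Build the tesseract command that writes <out_base>.pdf with a text layer."""
    return ["tesseract", str(image_path), str(out_base), "-l", lang, "pdf"]


def ocr_pdf(pdf_path, work_dir):
    """Rasterize pdf_path, OCR every page image, return the per-page PDF paths."""
    work_dir = Path(work_dir)
    work_dir.mkdir(parents=True, exist_ok=True)

    # Step 1: one PNG per page (assumed output names: page-1.png, page-2.png, ...).
    subprocess.run(rasterize_cmd(pdf_path, work_dir / "page"), check=True)

    # Step 2: OCR each page image into a searchable single-page PDF.
    results = []
    for image in sorted(work_dir.glob("page-*.png")):
        out_base = image.with_suffix("")  # strip ".png"
        subprocess.run(ocr_cmd(image, out_base), check=True)
        results.append(out_base.with_suffix(".pdf"))
    return results
```

Calling `ocr_pdf("00informatica_granul.pdf", "ocr_out")` would leave one searchable PDF per page in `ocr_out`. The pages would still have to be merged back into a single file, for example with Poppler's `pdfunite`.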
Source: https://stackoverflow.com/questions/14474405/indexing-pdf-from-badly-authored-latex-source