Indexing Word Documents and PDFs with Sphinx

醉话见心 2020-12-14 11:43

I have a website where users upload documents in .doc and .pdf format. I am using Sphinx to conduct full text searches on my SQL database (MySQL). What is the best way to

3 answers
  •  被撕碎了的回忆
    2020-12-14 12:37

    I use pdftotext and antiword for this. Both convert a file to plain text: pdftotext handles the PDFs and antiword handles the Word (.doc) documents. I dump that text into the database, and from there it's easy to index with Sphinx.
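
    The extraction step could be sketched roughly as follows. This is only a minimal illustration, assuming the `pdftotext` and `antiword` command-line tools are installed and on the PATH ("pdf2text" above is presumably `pdftotext`); the function names `build_command` and `extract_text` are hypothetical and not part of either tool. The returned text would then be stored in a text column that the Sphinx source query selects.

```python
import subprocess
from pathlib import Path


def build_command(path):
    """Pick the extractor command for a file based on its extension (hypothetical helper)."""
    suffix = Path(path).suffix.lower()
    if suffix == ".pdf":
        # A trailing "-" makes pdftotext write the text to stdout instead of a file.
        return ["pdftotext", str(path), "-"]
    if suffix == ".doc":
        # antiword prints the document text to stdout by default.
        return ["antiword", str(path)]
    raise ValueError(f"unsupported file type: {suffix!r}")


def extract_text(path, run=subprocess.run):
    """Run the extractor and return its output as a string.

    `run` is injectable so the function can be exercised without the tools installed.
    """
    result = run(build_command(path), capture_output=True, check=True)
    # Assumption: the tool output is UTF-8; undecodable bytes are replaced.
    return result.stdout.decode("utf-8", errors="replace")
```

    Calling `extract_text("upload.pdf")` on each uploaded file and saving the result next to the file's row keeps the Sphinx side unchanged: it only ever sees plain text in the database.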
