Extract text from doc and docx

死守一世寂寞 2020-11-27 16:24

I would like to know how I can read the contents of a doc or docx file. I'm using a Linux VPS and PHP, but if there is a simpler solution using another language, please let me know.

9 answers
  •  旧巷少年郎
    2020-11-27 17:07

    You can use Apache Tika as a complete solution; it provides a REST API.
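    As a rough illustration of the Tika route from PHP, here is a minimal sketch. It assumes a Tika Server instance is already running on the same machine at its default port (http://localhost:9998) and that allow_url_fopen is enabled. The function names tikaRequestOptions and extractTextWithTika are made up for this example. The document is sent with PUT to the /tika endpoint, and the Accept header asks for plain text back.

```php
<?php
declare(strict_types=1);

/**
 * Build HTTP stream-context options for a text-extraction request.
 * (Hypothetical helper name; kept separate so it can be inspected without a server.)
 */
function tikaRequestOptions(string $documentBytes): array
{
    return [
        'http' => [
            'method'  => 'PUT',
            // Ask for plain text; send the body as raw bytes so Tika detects the type itself.
            'header'  => "Accept: text/plain\r\nContent-Type: application/octet-stream\r\n",
            'content' => $documentBytes,
            'timeout' => 60,
        ],
    ];
}

/**
 * Send a .doc or .docx file to a Tika Server and return the extracted text.
 * The default URL is an assumption: a local server on Tika's default port.
 */
function extractTextWithTika(string $filePath, string $tikaUrl = 'http://localhost:9998/tika'): string
{
    $bytes = file_get_contents($filePath);
    if ($bytes === false) {
        throw new RuntimeException("Cannot read file: $filePath");
    }

    $context = stream_context_create(tikaRequestOptions($bytes));
    $text = file_get_contents($tikaUrl, false, $context);
    if ($text === false) {
        throw new RuntimeException("Tika request failed: $tikaUrl");
    }

    return $text;
}
```

    With the server running, a call such as extractTextWithTika('/path/to/file.docx') should return the document text as a string. The same call is meant to work for .doc files, since Tika detects the format from the uploaded bytes.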

    Another good library is RawText, which can run OCR on images and extract text from any document. It is not free, and it works over a REST API.

    Sample code for extracting text from your file with RawText:

    $result = $rawText->extract($your_file);
    
