Convert Word doc or docx files into text files?

难免孤独 2020-12-05 01:28

I need a way to convert .doc or .docx files to .txt without installing anything. I also don't want to have to manually open Wor

11 answers
  •  时光取名叫无心
    2020-12-05 01:48

    A simple Perl-only solution for docx:

    1. Use Archive::Zip to get the word/document.xml file from your docx file. (A docx is just a zipped archive.)

    2. Use XML::LibXML to parse it.

    3. Then use XML::LibXSLT to transform it into text or html format. Search the web to find a nice docx2txt.xsl file :)
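
    For readers who want to see the unzip-then-parse idea end to end, here is a minimal sketch. It is not the Perl code described above: it swaps in Python's standard library (zipfile and xml.etree.ElementTree), and instead of applying an XSLT stylesheet it walks the XML directly and collects the text runs. The function name docx_to_text is made up for illustration. It handles .docx only (not the legacy binary .doc format) and ignores headers, footers, footnotes and text boxes.

```python
import zipfile
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by the elements in word/document.xml
W_NS = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"


def docx_to_text(source):
    """Return the body text of a .docx, given a path or a binary file object.

    Illustrative helper (hypothetical name), not part of any library.
    """
    # Step 1: a .docx is a zip archive; the main body is word/document.xml.
    with zipfile.ZipFile(source) as archive:
        xml_bytes = archive.read("word/document.xml")

    # Step 2: parse the XML.
    root = ET.fromstring(xml_bytes)

    # Step 3 (simplified): no XSLT here. Emit one output line per
    # paragraph (w:p), joining its text runs (w:t) and turning explicit
    # line breaks (w:br) into newlines.
    paragraphs = []
    for para in root.iter(f"{{{W_NS}}}p"):
        parts = []
        for node in para.iter():
            if node.tag == f"{{{W_NS}}}t" and node.text:
                parts.append(node.text)
            elif node.tag == f"{{{W_NS}}}br":
                parts.append("\n")
        paragraphs.append("".join(parts))
    return "\n".join(paragraphs)
```

    Calling docx_to_text("report.docx") (the file name is only an example) returns a string that can then be written to a .txt file with an ordinary open(..., "w", encoding="utf-8").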

    Cheers !

    J.
