get docx file contents using javascript/jquery

淺唱寂寞╮ 提交于 2019-11-28 11:42:36
edi9999

With docxtemplater, you can easily get the full text of a word (works with docx only) by using the doc.getFullText() method.

HTML code:

<script src="build/docxgen.js"></script>
<script src="vendor/FileSaver.min.js"></script>
<script src="vendor/jszip-utils.js"></script>
<script>
    var loadFile=function(url,callback){
        JSZipUtils.getBinaryContent(url,callback);
    }
    loadFile("examples/tagExample.docx",function(err,content){
        var doc=new Docxgen(content);
        text=doc.getFullText();
        console.log(text);
    });
</script>

Getting the source code:

git clone https://github.com/edi9999/docxtemplater.git && cd docxtemplater
# git checkout v1.0.4 # Optional
npm install -g gulp jasmine-node uglify-js browserify
npm install
gulp allCoffee
mkdir build -p
browserify -r ./js/docxgen.js -s Docxgen > build/docxgen.js
uglifyjs build/docxgen.js > build/docxgen.min.js # Optional

If you want to be able to display the docx files in a web browser, you might be interested in Native Documents' recently released commercial Word File Editor; try it at https://nativedocuments.com/test_drive.html

You'll get much better layout fidelity if you do it this way, than if you try to convert to (X)HTML and view it that way.

It is designed specifically for embedding in a webapp, so there is an API for loading documents, and it will sit happily within the security context of your webapp.

Disclosure: I have a commercial interest in Native Documents

Brian Dobby

I know this is an old post, but doctemplater has moved on and the accepted answer no longer works. This worked for me:

function loadDocx(filename) {
  // Read document.xml from docx document
  const AdmZip = require("adm-zip");
  const zip = new AdmZip(filename);
  const xml = zip.readAsText("word/document.xml");
  // Load xml DOM
  const cheerio = require('cheerio');
  $ = cheerio.load(xml, {
    normalizeWhitespace: true,
    xmlMode: true
  })
  // Extract text
  let out = new Array()
  $('w\\:t').each((i, el) => {
    out.push($(el).text())
  })
  return out
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!