Get docx file contents using JavaScript/jQuery

Posted by 故事扮演 on 2019-11-27 06:18:31

Question


I want to open and read a docx file using client-side technologies (HTML/JS).

Please let me know whether this is possible. I found a JavaScript library named docx.js, but I cannot find any documentation for it (http://blog.innovatejs.com/?p=184).

The goal is to build a browser-based search tool for docx and txt files.

Any help is appreciated.


Answer 1:


With docxtemplater, you can easily get the full text of a Word document (docx only) by using the doc.getFullText() method.

HTML code:

<script src="build/docxgen.js"></script>
<script src="vendor/FileSaver.min.js"></script>
<script src="vendor/jszip-utils.js"></script>
<script>
    // Fetch the docx file as binary content
    var loadFile = function (url, callback) {
        JSZipUtils.getBinaryContent(url, callback);
    };
    loadFile("examples/tagExample.docx", function (err, content) {
        if (err) {
            throw err; // the file could not be loaded
        }
        var doc = new Docxgen(content);
        var text = doc.getFullText(); // declared with var to avoid an implicit global
        console.log(text);
    });
</script>

Getting the source code:

git clone https://github.com/edi9999/docxtemplater.git && cd docxtemplater
# git checkout v1.0.4 # Optional
npm install -g gulp jasmine-node uglify-js browserify
npm install
gulp allCoffee
mkdir -p build
browserify -r ./js/docxgen.js -s Docxgen > build/docxgen.js
uglifyjs build/docxgen.js > build/docxgen.min.js # Optional
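
Once the text is available, for example from doc.getFullText() above, the search step that the question asks about could be sketched roughly as follows. This is only a minimal illustration, not part of docxtemplater: the name searchText and its contextChars parameter are hypothetical, and it assumes a plain case-insensitive substring match is good enough.

```javascript
// Hypothetical helper (not a docxtemplater API): find every case-insensitive
// occurrence of `query` in `text` and return its position plus some context.
// Assumption: lower-casing does not change string length, which holds for
// typical text but not for every Unicode character.
function searchText(text, query, contextChars = 20) {
  const results = [];
  if (!query) {
    return results; // an empty query would match at every position
  }
  const haystack = text.toLowerCase();
  const needle = query.toLowerCase();
  let index = haystack.indexOf(needle);
  while (index !== -1) {
    const start = Math.max(0, index - contextChars);
    const end = Math.min(text.length, index + query.length + contextChars);
    results.push({ index: index, snippet: text.slice(start, end) });
    // Continue searching after the current match
    index = haystack.indexOf(needle, index + needle.length);
  }
  return results;
}
```

The same function works for txt files, since it only needs a string; for a docx file the string would come from doc.getFullText().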



Answer 2:


If you want to be able to display the docx files in a web browser, you might be interested in Native Documents' recently released commercial Word File Editor; try it at https://nativedocuments.com/test_drive.html

You'll get much better layout fidelity this way than if you convert the document to (X)HTML and view that.

It is designed specifically for embedding in a webapp, so there is an API for loading documents, and it will sit happily within the security context of your webapp.

Disclosure: I have a commercial interest in Native Documents.




Answer 3:


I know this is an old post, but docxtemplater has moved on and the accepted answer no longer works. This worked for me:

// Node.js only: adm-zip reads from the file system, so this does not run in a browser
const AdmZip = require("adm-zip");
const cheerio = require("cheerio");

function loadDocx(filename) {
  // Read document.xml from the docx document (a docx file is a zip archive)
  const zip = new AdmZip(filename);
  const xml = zip.readAsText("word/document.xml");
  // Load the XML DOM
  const $ = cheerio.load(xml, {
    normalizeWhitespace: true,
    xmlMode: true
  });
  // Extract the text of every <w:t> element
  const out = [];
  $('w\\:t').each((i, el) => {
    out.push($(el).text());
  });
  return out;
}
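
To make it clearer what selecting w:t does, here is a dependency-free sketch that pulls the same text runs out of a document.xml string with a regular expression. The helper names extractRuns and decodeXmlEntities are made up for this illustration, and a regular expression is only an approximation: a real XML parser, as used above, is more robust (for example, with self-closing elements that carry attributes).

```javascript
// Hypothetical helper: decode the five predefined XML entities.
// "&amp;" is handled last so "&amp;lt;" becomes the literal text "&lt;".
function decodeXmlEntities(s) {
  return s
    .replace(/&lt;/g, "<")
    .replace(/&gt;/g, ">")
    .replace(/&quot;/g, '"')
    .replace(/&apos;/g, "'")
    .replace(/&amp;/g, "&");
}

// Hypothetical helper: collect the text of every <w:t> element.
function extractRuns(xml) {
  const runs = [];
  // Matches <w:t> and <w:t attr="...">, but not <w:tbl>, <w:tr>, <w:tab/>
  const pattern = /<w:t(?:\s[^>]*)?>([\s\S]*?)<\/w:t>/g;
  let match;
  while ((match = pattern.exec(xml)) !== null) {
    runs.push(decodeXmlEntities(match[1]));
  }
  return runs;
}

const sampleXml =
  '<w:p><w:r><w:t>Hello </w:t></w:r>' +
  '<w:r><w:t xml:space="preserve">R&amp;D</w:t></w:r></w:p>';
console.log(extractRuns(sampleXml)); // → ["Hello ", "R&D"]
```

Joining the returned array with an empty string gives the text of the paragraph; note that Word often splits a single sentence across several runs.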


Source: https://stackoverflow.com/questions/28440170/get-docx-file-contents-using-javascript-jquery
