Reading a .docx file in Java

时光总嘲笑我的痴心妄想 posted on 2019-12-02 11:02:38
import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

// The class name is arbitrary; the method needs some enclosing class to compile.
public class DocxReader {
    // Prints the text of every body paragraph in the document.
    public void readDocxFile() {
        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fis = new FileInputStream("C:/NetBeans Output/documentx.docx")) {
            XWPFDocument document = new XWPFDocument(fis);

            // Paragraphs are returned in document order
            List<XWPFParagraph> paragraphs = document.getParagraphs();

            for (XWPFParagraph para : paragraphs) {
                System.out.println(para.getText());
            }
        } catch (Exception e) {
            // Covers I/O errors as well as unchecked exceptions POI throws for malformed files
            e.printStackTrace();
        }
    }
}

Internally, .docx files are organized as zipped XML files, whereas .doc is a binary file format, so you cannot read either one directly as plain text. Have a look at docx4j or Apache POI.
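To make the "zipped XML" point concrete, here is a minimal sketch that uses only the JDK (no POI or docx4j) to open a .docx as a ZIP archive and pull the text out of its main XML part. The class name DocxZipPeek and its methods are hypothetical, and the sketch assumes the main part is stored under the conventional name word/document.xml, which is what Word normally writes, although the format technically locates it through relationship files. It ignores tables nested in text boxes, headers, footnotes and formatting, so treat it as an illustration of the container layout rather than a replacement for a real library.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper class for illustration; not part of POI or docx4j.
public class DocxZipPeek {
    // Namespace of WordprocessingML elements such as w:p (paragraph) and w:t (text).
    private static final String W_NS =
            "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    /** Returns one string per w:p element found in word/document.xml. */
    public static List<String> extractParagraphs(InputStream docx) throws Exception {
        try (ZipInputStream zip = new ZipInputStream(docx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: the main part uses the conventional entry name.
                if ("word/document.xml".equals(entry.getName())) {
                    return parseParagraphs(readCurrentEntry(zip));
                }
            }
        }
        throw new IOException("word/document.xml not found; is this a .docx file?");
    }

    // Copies the bytes of the current ZIP entry into memory.
    private static byte[] readCurrentEntry(ZipInputStream zip) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[8192];
        int n;
        while ((n = zip.read(buffer)) != -1) {
            out.write(buffer, 0, n);
        }
        return out.toByteArray();
    }

    // Parses the XML and concatenates the w:t text runs of each paragraph.
    private static List<String> parseParagraphs(byte[] xml) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Refuse DOCTYPE declarations so untrusted files cannot trigger entity expansion.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        Document dom = factory.newDocumentBuilder().parse(new ByteArrayInputStream(xml));

        List<String> result = new ArrayList<>();
        NodeList paragraphs = dom.getElementsByTagNameNS(W_NS, "p");
        for (int i = 0; i < paragraphs.getLength(); i++) {
            NodeList texts = ((Element) paragraphs.item(i)).getElementsByTagNameNS(W_NS, "t");
            StringBuilder text = new StringBuilder();
            for (int j = 0; j < texts.getLength(); j++) {
                text.append(texts.item(j).getTextContent());
            }
            result.add(text.toString());
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java DocxZipPeek <file.docx>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            for (String paragraph : extractParagraphs(in)) {
                System.out.println(paragraph);
            }
        }
    }
}
```

Running it against a .docx prints one line per paragraph. The same approach does not work for .doc, because that format is a binary container rather than a ZIP of XML parts.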

If you are trying to create or manipulate a .docx file, try docx4j (its source code is available online), or go for Apache POI.

You may want to check Apache POI.

vkrams

You cannot read a .docx or .doc file directly; you need an API that understands Word files. Use Apache POI (http://poi.apache.org/). If you have any doubts, please refer to this thread on stackoverflow.com: "How read Doc or Docx file in java?"
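Since the two formats need different APIs, it can help to check which container a file actually uses before handing it to a library, because file extensions are not always reliable. The sketch below is a hypothetical JDK-only helper (WordFormatSniffer is not a POI class) that inspects the leading bytes: ZIP archives start with "PK\x03\x04" and OLE2 compound files, which .doc uses, start with D0 CF 11 E0 A1 B1 1A E1. A ZIP signature only shows that the file is a ZIP container, not that it is specifically a Word document.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

// Hypothetical helper class for illustration; not part of Apache POI.
public class WordFormatSniffer {
    public enum Format { ZIP_CONTAINER, OLE2_CONTAINER, UNKNOWN }

    // Local file header signature at the start of a ZIP archive (used by .docx).
    private static final int[] ZIP_MAGIC = {0x50, 0x4B, 0x03, 0x04};
    // Header signature of an OLE2 compound file (used by binary .doc).
    private static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    /** Classifies a file from its first bytes. */
    public static Format sniff(byte[] header) {
        if (startsWith(header, OLE2_MAGIC)) {
            return Format.OLE2_CONTAINER;
        }
        if (startsWith(header, ZIP_MAGIC)) {
            return Format.ZIP_CONTAINER;
        }
        return Format.UNKNOWN;
    }

    /** Reads up to 8 bytes from the stream and classifies them. */
    public static Format sniff(InputStream in) throws IOException {
        byte[] header = new byte[OLE2_MAGIC.length];
        int read = 0;
        while (read < header.length) {
            int n = in.read(header, read, header.length - read);
            if (n < 0) {
                break; // stream ended before 8 bytes were available
            }
            read += n;
        }
        return sniff(Arrays.copyOf(header, read));
    }

    // Compares unsigned byte values against the expected signature.
    private static boolean startsWith(byte[] data, int[] magic) {
        if (data.length < magic.length) {
            return false;
        }
        for (int i = 0; i < magic.length; i++) {
            if ((data[i] & 0xFF) != magic[i]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java WordFormatSniffer <file>");
            return;
        }
        try (InputStream in = new FileInputStream(args[0])) {
            System.out.println(sniff(in));
        }
    }
}
```

With Apache POI, a ZIP_CONTAINER result would typically be passed to XWPFDocument and an OLE2_CONTAINER result to HWPFDocument; anything else is probably not a Word file at all.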

Swati Pisal

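You must have the following 6 JARs on the classpath (note that Apache POI does not support mixing JARs from different POI releases, so if you hit linkage errors, align the poi-* versions):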

  1. xmlbeans-2.3.0.jar
  2. dom4j-1.6.1.jar
  3. poi-ooxml-3.8-20120326.jar
  4. poi-ooxml-schemas-3.8-20120326.jar
  5. poi-scratchpad-3.2-FINAL.jar
  6. poi-3.5-FINAL.jar

Code:

import java.io.FileInputStream;
import java.util.List;
import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;

public class test {
    // Prints the text of each body paragraph in the given .docx file.
    public static void readDocxFile(String fileName) {
        // try-with-resources closes the stream even if reading fails
        try (FileInputStream fis = new FileInputStream(fileName)) {
            XWPFDocument document = new XWPFDocument(fis);
            // Fetch the body paragraphs in document order
            List<XWPFParagraph> paragraphs = document.getParagraphs();
            for (int i = 0; i < paragraphs.size(); i++) {
                System.out.println(paragraphs.get(i).getParagraphText());
            }
        } catch (Exception e) {
            e.printStackTrace();
        }
    }

    public static void main(String[] args) {
        readDocxFile("C:\\Users\\sp0c43734\\Desktop\\SwatiPisal.docx");
    }
}