Reading doc and docx files using C# without having MS Office installed on server

懵懂的女人 submitted on 2019-11-28 11:22:30
Pavel Kudinov

We can now use the open source NPOI library (a .NET port of Apache POI), which also supports docx, xls and xlsx. DocX is another open source library for creating Word documents.
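As a rough illustration of the NPOI route, here is a minimal sketch that reads the paragraph text of a .docx file. It assumes the NPOI NuGet package is referenced and uses its XWPF classes; the class name NpoiDocxReader and the method name ExtractText are made up for this example and are not part of NPOI.

```csharp
using System.IO;
using System.Text;
using NPOI.XWPF.UserModel; // assumption: the NPOI NuGet package is referenced

// Hypothetical helper for illustration; not part of NPOI itself.
public static class NpoiDocxReader
{
    // Returns the text of each body paragraph, one paragraph per line.
    public static string ExtractText(string path)
    {
        using (FileStream stream = File.OpenRead(path))
        {
            // XWPFDocument parses the .docx package from the stream.
            XWPFDocument document = new XWPFDocument(stream);
            StringBuilder text = new StringBuilder();

            foreach (XWPFParagraph paragraph in document.Paragraphs)
            {
                text.AppendLine(paragraph.ParagraphText);
            }

            return text.ToString();
        }
    }
}
```

Calling NpoiDocxReader.ExtractText(@"D:\Test\testdoc1.docx") should return the paragraph text as a single string. Text inside tables is not part of document.Paragraphs, so this sketch would skip it.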

For DOCX I'd suggest the Open XML API. Microsoft developed the Open XML SDK so that Office files can be created and read through their underlying XML parts. Note, though, that the latest version, 2.5, was released in 2013, which is five years ago at the time of writing.
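For reference, a minimal sketch of the Open XML approach could look like the following. It assumes the DocumentFormat.OpenXml NuGet package (the Open XML SDK) is referenced; the class name OpenXmlDocxReader and the method name ExtractText are hypothetical names chosen for this example. The SDK reads .docx only, so this will not work for the older binary .doc format.

```csharp
using DocumentFormat.OpenXml.Packaging; // assumption: DocumentFormat.OpenXml NuGet package is referenced

// Hypothetical helper for illustration; not part of the Open XML SDK.
public static class OpenXmlDocxReader
{
    // Returns the concatenated text of the document body.
    public static string ExtractText(string path)
    {
        // The second argument (false) opens the file read-only.
        using (WordprocessingDocument document = WordprocessingDocument.Open(path, false))
        {
            // InnerText joins the text of all elements in the body
            // without inserting separators between paragraphs.
            return document.MainDocumentPart.Document.Body.InnerText;
        }
    }
}
```

Because InnerText runs paragraphs together, you may prefer to loop over the body's Paragraph elements and add your own line breaks if the layout matters.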

You can use Code7248.word_reader.dll.

Below is sample code showing how to use Code7248.word_reader.dll.

Add a reference to this DLL in your project and copy the code below.

using System;
using System.Collections.Generic;
using System.Text;
// Namespace provided by Code7248.word_reader.dll
using Code7248.word_reader;


namespace testWordRead
{
    class Program
    {
        private void readFileContent(string path)
        {
            TextExtractor extractor = new TextExtractor(path);
            string text = extractor.ExtractText();
            Console.WriteLine(text);
        }
        static void Main(string[] args)
        {
            Program cs = new Program();
            // Verbatim string (@) so the backslashes are not treated as escape sequences
            string path = @"D:\Test\testdoc1.docx";
            cs.readFileContent(path);
            Console.ReadLine();
        }
    }
}

Update: NPOI supports docx now. Please try the latest release (NPOI 2.0 beta)
