Read from word document line by line

て烟熏妆下的殇ゞ 提交于 2019-11-26 07:39:11

问题



I\'m trying to read a word document using C#. I am able to get all text but I want to be able to read line by line and store in a list and bind to a gridview. Currently my code returns a list of one item only with all text (not line by line as desired). I\'m using the Microsoft.Office.Interop.Word library to read the file. Below is my code till now:

    Application word = new Application();
    Document doc = new Document();

    object fileName = path;
    // Define an object to pass to the API for missing parameters
    object missing = System.Type.Missing;
    doc = word.Documents.Open(ref fileName,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing);

    String read = string.Empty;
    List<string> data = new List<string>();
    foreach (Range tmpRange in doc.StoryRanges)
    {
        //read += tmpRange.Text + \"<br>\";
        data.Add(tmpRange.Text);
    }
    ((_Document)doc).Close();
    ((_Application)word).Quit();

    GridView1.DataSource = data;
    GridView1.DataBind();

回答1:


Ok. I found the solution here.


The final code is as follows:

    Application word = new Application();
    Document doc = new Document();

    object fileName = path;
    // Define an object to pass to the API for missing parameters
    object missing = System.Type.Missing;
    doc = word.Documents.Open(ref fileName,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing, ref missing,
            ref missing, ref missing, ref missing);

    String read = string.Empty;
    List<string> data = new List<string>();
    for (int i = 0; i < doc.Paragraphs.Count; i++)
    {
        string temp = doc.Paragraphs[i + 1].Range.Text.Trim();
        if (temp != string.Empty)
            data.Add(temp);
    }
    ((_Document)doc).Close();
    ((_Application)word).Quit();

    GridView1.DataSource = data;
    GridView1.DataBind();



回答2:


The above code is correct, but it's too slow. I have improved the code, and it's much faster than the above one.

List<string> data = new List<string>();
Application app = new Application();
Document doc = app.Documents.Open(ref readFromPath);

foreach (Paragraph objParagraph in doc.Paragraphs)
    data.Add(objParagraph.Range.Text.Trim());

((_Document)doc).Close();
((_Application)app).Quit();



回答3:


How about this yo. Get all the words from the doc and split them on return or whatever is better for you. Then turn into list

   List<string> lines = doc.Content.Text.Split('\n').ToList();


来源:https://stackoverflow.com/questions/18555064/read-from-word-document-line-by-line

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!