Get pages of a Word document

感情败类 2020-12-10 06:34

I'm trying to get all pages of an MSWord document via Microsoft.Office.Interop.Word (I'm using C# in VS2012). What I would like to get is List<String> Pages, where inde

2 answers
  •  清歌不尽
    2020-12-10 07:01

    A bit of a simpler solution.

    Pseudo code:

    • Grab the page count.
    • For each page:
      • Take the characters between the previous page's last character index and this page's last character index.

    Implementation:

        // Requires: using System.Collections.Generic; using Microsoft.Office.Interop.Word;

        /// <summary>
        /// Reads each page of the Word document into a string and returns the list of the page strings.
        /// </summary>
        public static IEnumerable<string> ReadPages(string filePath)
        {
            List<string> pageStrings = new List<string>();
            Microsoft.Office.Interop.Word.Application app = new Microsoft.Office.Interop.Word.Application();
            try
            {
                Document doc = app.Documents.Open(filePath);

                int pageCount = doc.ComputeStatistics(WdStatistic.wdStatisticPages);
                int lastPageEnd = 0; // The document starts at 0.
                for (int i = 0; i < pageCount; i++)
                {
                    // GoToNext returns the "range" of the page break. This is actually a range of
                    // 0 elements: both start and end are the location of the page break.
                    // The last page has no next page, so it ends at the end of the document.
                    int pageEnd = (i < pageCount - 1)
                        ? app.Selection.GoToNext(WdGoToItem.wdGoToPage).End
                        : doc.Content.End;
                    string currentPageText = doc.Range(lastPageEnd, pageEnd).Text;
                    lastPageEnd = pageEnd;
                    pageStrings.Add(currentPageText);
                }
            }
            finally
            {
                // Quit Word without saving so no hidden WINWORD.EXE process is left behind.
                ((Microsoft.Office.Interop.Word._Application)app).Quit(false);
            }
            return pageStrings;
        }
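
    The offset bookkeeping from the pseudo code can be tried without Word installed. The following is only a minimal sketch and not part of the original answer: PageSlicer and SliceByPageEnds are hypothetical names, and the page-end offsets are hard-coded here, whereas the real code would read them from the page-break ranges.

    ```csharp
    using System;
    using System.Collections.Generic;

    public static class PageSlicer
    {
        // Hypothetical helper: cuts text into pages, given the end offset of each page.
        public static List<string> SliceByPageEnds(string text, IEnumerable<int> pageEnds)
        {
            var pages = new List<string>();
            int lastPageEnd = 0; // The text starts at offset 0.
            foreach (int pageEnd in pageEnds)
            {
                // A page is everything between the previous page's end and this page's end.
                pages.Add(text.Substring(lastPageEnd, pageEnd - lastPageEnd));
                lastPageEnd = pageEnd;
            }
            return pages;
        }

        public static void Main()
        {
            // '|' stands in for a page break; the last page ends at the end of the text.
            string text = "first page|second page|third";
            List<string> pages = SliceByPageEnds(text, new[] { 11, 23, text.Length });
            foreach (string page in pages)
            {
                Console.WriteLine(page);
            }
        }
    }
    ```

    Passing the end of the text as the final offset plays the same role as ending the last page at the end of the document, since there is no further page break to mark it.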
    
