How to find the page number from a paragraph using OpenXML?

Posted by 余生颓废 on 2020-01-01 11:31:46

Question


For a Paragraph object, how can I determine which page it is on using the Open XML SDK 2.0 for Microsoft Office?


Answer 1:


It is not possible to get page numbers for a Word document using the Open XML SDK alone, because pagination is computed by the client (such as MS Word) when it lays out the document.

However, if the document you are working with was previously opened in a Word client and saved again, the client will have added LastRenderedPageBreak elements to mark where the page breaks fell the last time it rendered the document. My other answer on LastRenderedPageBreak has more detail. This lets you count the LastRenderedPageBreak elements that come before your paragraph to get its page number.
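To make the counting idea concrete, here is a minimal sketch that works on the raw WordprocessingML with LINQ to XML instead of the SDK, so it runs without any extra packages. The class name `RenderedPageSketch`, the method names, and the sample markup are all made up for illustration. It assumes Word has already written `w:lastRenderedPageBreak` markers into the body.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class RenderedPageSketch
{
    // Main WordprocessingML namespace (the "w:" prefix in document.xml).
    static readonly XNamespace W =
        "http://schemas.openxmlformats.org/wordprocessingml/2006/main";

    // Hypothetical sample: three paragraphs, with a marker at the start of the second.
    public static XElement SampleBody()
    {
        return XElement.Parse(
@"<w:body xmlns:w='http://schemas.openxmlformats.org/wordprocessingml/2006/main'>
  <w:p><w:r><w:t>First page</w:t></w:r></w:p>
  <w:p><w:r><w:lastRenderedPageBreak/><w:t>Second page</w:t></w:r></w:p>
  <w:p><w:r><w:t>Still second page</w:t></w:r></w:p>
</w:body>");
    }

    // Returns the 1-based page on which the paragraph's first text was last rendered.
    public static int PageOfParagraph(XElement body, int paragraphIndex)
    {
        XElement paragraph = body.Elements(W + "p").ElementAt(paragraphIndex);
        // Anchor on the first text element so a marker at the very start of the paragraph counts.
        XElement anchor = paragraph.Descendants(W + "t").FirstOrDefault() ?? paragraph;
        int breaksBefore = body.Descendants(W + "lastRenderedPageBreak")
                               .Count(b => b.IsBefore(anchor));
        return 1 + breaksBefore;
    }

    public static void Main()
    {
        XElement body = SampleBody();
        for (int i = 0; i < 3; i++)
            Console.WriteLine("Paragraph {0} starts on page {1}", i, PageOfParagraph(body, i));
        // Prints pages 1, 2 and 2 for the sample above.
    }
}
```

The result is only as fresh as the last time Word saved the file. If the document was edited programmatically afterwards, the markers may no longer match the real layout.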

If that is not the case, a crude workaround is to add footers with page numbers (perhaps in the same colour as the page background, so they are effectively hidden). This is only an option if you are generating the Word document yourself with the Open XML SDK.




Answer 2:


@Flowerking: thanks for the information.

Because I need to loop over all the paragraphs anyway to search for a certain string, I can use the following code to find the page number:

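using System;
using System.Collections.Generic;
using System.Linq;
using DocumentFormat.OpenXml.Packaging;
using DocumentFormat.OpenXml.Wordprocessing;

// Top-level statements (C# 9+); on older compilers, move this block into Main.
using (var document = WordprocessingDocument.Open(@"c:\test.docx", false))
{
    var paragraphInfos = new List<ParagraphInfo>();

    var paragraphs = document.MainDocumentPart.Document.Descendants<Paragraph>();

    int pageIdx = 1;
    foreach (var paragraph in paragraphs)
    {
        // Look at every run, not just the first one: a break can sit in any run.
        foreach (var run in paragraph.Descendants<Run>())
        {
            // Markers Word wrote the last time it laid out the document.
            pageIdx += run.Elements<LastRenderedPageBreak>().Count();
            // Explicit page breaks only; a Break of any other type is a line or column break.
            pageIdx += run.Elements<Break>()
                .Count(b => b.Type != null && b.Type.Value == BreakValues.Page);
        }

        // A paragraph that spans a page boundary is reported on the page where it ends.
        var info = new ParagraphInfo
        {
            Paragraph = paragraph,
            PageNumber = pageIdx
        };

        paragraphInfos.Add(info);
    }

    foreach (var info in paragraphInfos)
    {
        Console.WriteLine("Page {0}/{1} : '{2}'", info.PageNumber, pageIdx, info.Paragraph.InnerText);
    }
}

// Holds a paragraph together with the page it was found on.
class ParagraphInfo
{
    public Paragraph Paragraph { get; set; }
    public int PageNumber { get; set; }
}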



Answer 3:


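Here's an extension method I wrote for that:

    using System.Linq;
    using DocumentFormat.OpenXml;
    using DocumentFormat.OpenXml.Wordprocessing;

    public static class OpenXmlElementExtensions
    {
        // Counts the LastRenderedPageBreak markers that precede elem, walking up to root
        // (e.g. the document Body), and returns a 1-based page number.
        public static int GetPageNumber(this OpenXmlElement elem, OpenXmlElement root)
        {
            int pageNbr = 1;
            var tmpElem = elem;
            // The null check stops the walk if root is not an ancestor of elem.
            while (tmpElem != null && tmpElem != root)
            {
                // Everything before tmpElem at this level was laid out earlier.
                var sibling = tmpElem.PreviousSibling();
                while (sibling != null)
                {
                    pageNbr += sibling.Descendants<LastRenderedPageBreak>().Count();
                    sibling = sibling.PreviousSibling();
                }
                tmpElem = tmpElem.Parent;
            }
            return pageNbr;
        }
    }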


Source: https://stackoverflow.com/questions/14936701/how-to-find-the-page-number-from-a-paragraph-using-openxml
