Reading PDF Bookmarks in VB.NET using iTextSharp

旧时难觅i 2021-01-15 02:03

I am making a tool that scans PDF files and searches for text in both the PDF bookmarks and the body text. I am using Visual Studio 2008 with VB.NET and iTextSharp.

How do I loa

2 Answers
  •  深忆病人
    2021-01-15 03:10

    It depends on what you mean by "bookmarks".

    You want the outlines (the entries that are visible in the bookmarks panel):

    The CreateOutlineTree example shows you how to use the SimpleBookmark class to create an XML file containing the complete outline tree (in PDF jargon, bookmarks are called outlines).

    Java:

    PdfReader reader = new PdfReader(src);
    List<HashMap<String, Object>> list = SimpleBookmark.getBookmark(reader);
    SimpleBookmark.exportToXML(list,
            new FileOutputStream(dest), "ISO8859-1", true);
    reader.close();
    

    C#:

    PdfReader reader = new PdfReader(pdfIn);
    var list = SimpleBookmark.GetBookmark(reader);
    using (MemoryStream ms = new MemoryStream()) {
        SimpleBookmark.ExportToXML(list, ms, "ISO8859-1", true); 
        ms.Position = 0;
        using (StreamReader sr =  new StreamReader(ms)) {
            return sr.ReadToEnd();
        }              
    } 
    

    The list object can also be used to examine the different bookmark elements one by one programmatically (this is all explained in the official documentation).
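
    As a rough sketch of examining the entries one by one: in iText 5 (Java), each entry in the list is a map with a "Title" key and, for nested entries, a "Kids" key holding a list of the same shape. The iTextSharp port is assumed to use the same keys. The class BookmarkSearch, its methods, and the sample outline below are hypothetical and built by hand so the code runs without a PDF; with a real file you would pass in the result of SimpleBookmark.getBookmark(reader) instead.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Locale;

// Hypothetical helper, not part of iText: searches outline titles for a term.
class BookmarkSearch {

    // Recursively collects the titles of all entries whose "Title" contains
    // the search term (case-insensitive). Nested entries are read from the
    // "Kids" key, which is assumed to hold a list of maps of the same shape.
    @SuppressWarnings("unchecked")
    static List<String> findTitles(List<HashMap<String, Object>> bookmarks, String term) {
        List<String> hits = new ArrayList<>();
        if (bookmarks == null) {
            return hits; // defensive: a document without outlines may yield null
        }
        String needle = term.toLowerCase(Locale.ROOT);
        for (HashMap<String, Object> entry : bookmarks) {
            Object title = entry.get("Title");
            if (title != null && title.toString().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.add(title.toString());
            }
            Object kids = entry.get("Kids");
            if (kids instanceof List) {
                hits.addAll(findTitles((List<HashMap<String, Object>>) kids, term));
            }
        }
        return hits;
    }

    // Builds one hand-made outline entry (stand-in for what iText would return).
    static HashMap<String, Object> entry(String title, List<HashMap<String, Object>> kids) {
        HashMap<String, Object> map = new HashMap<>();
        map.put("Title", title);
        if (kids != null) {
            map.put("Kids", kids);
        }
        return map;
    }

    // Illustrative data only: two top-level entries, the first with two children.
    static List<HashMap<String, Object>> sampleOutline() {
        List<HashMap<String, Object>> kids = new ArrayList<>();
        kids.add(entry("Installing on Windows", null));
        kids.add(entry("Troubleshooting", null));

        List<HashMap<String, Object>> top = new ArrayList<>();
        top.add(entry("Chapter 1: Installation", kids));
        top.add(entry("Chapter 2: Usage", null));
        return top;
    }

    public static void main(String[] args) {
        // Parent entries are reported before their children.
        System.out.println(findTitles(sampleOutline(), "install"));
    }
}
```

    The traversal is depth-first, so matches come back in the order they appear in the bookmarks panel.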

    You want the named destinations (specific places in the document you can link to by name):

    If you meant named destinations instead, you need the SimpleNamedDestination class, as shown in the LinkActions example:

    Java:

    PdfReader reader = new PdfReader(src);
    HashMap<String, String> map = SimpleNamedDestination.getNamedDestination(reader, false);
    SimpleNamedDestination.exportToXML(map, new FileOutputStream(dest),
            "ISO8859-1", true);
    reader.close();
    

    C#:

    PdfReader reader = new PdfReader(src);
    Dictionary<string, string> map = SimpleNamedDestination
          .GetNamedDestination(reader, false);
    using (MemoryStream ms = new MemoryStream()) {
        SimpleNamedDestination.ExportToXML(map, ms, "ISO8859-1", true);
        ms.Position = 0;
        using (StreamReader sr =  new StreamReader(ms)) {
          return sr.ReadToEnd();
        }
    }
    

    The map object can also be used to examine the different named destinations one by one programmatically. Note the Boolean parameter that is used when retrieving the named destinations. Named destinations can be stored using a PDF name object as name, or using a PDF string object. The Boolean parameter indicates whether you want the former (true = stored as PDF name objects) or the latter (false = stored as PDF string objects) type of named destinations.
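
    A minimal sketch of examining the map programmatically, assuming it maps each destination name to a string describing its target. The class NamedDestinationSearch, its methods, and the sample values are hypothetical; in particular, the value strings below (a page number followed by a destination type) are an assumption for illustration, so check what your own files return. With a real file you would pass in the map from SimpleNamedDestination.getNamedDestination(reader, false).

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of iText: filters named destinations by name.
class NamedDestinationSearch {

    // Returns the entries whose name contains the term (case-insensitive),
    // sorted by name so the output order is stable.
    static Map<String, String> findByName(Map<String, String> destinations, String term) {
        Map<String, String> hits = new TreeMap<>();
        if (destinations == null) {
            return hits; // defensive: nothing to search
        }
        String needle = term.toLowerCase(Locale.ROOT);
        for (Map.Entry<String, String> e : destinations.entrySet()) {
            if (e.getKey().toLowerCase(Locale.ROOT).contains(needle)) {
                hits.put(e.getKey(), e.getValue());
            }
        }
        return hits;
    }

    // Illustrative data only; the value format is an assumption.
    static HashMap<String, String> sampleDestinations() {
        HashMap<String, String> map = new HashMap<>();
        map.put("chapter1", "1 XYZ 36 802 0");
        map.put("chapter2", "5 XYZ 36 802 0");
        map.put("appendix", "12 Fit");
        return map;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : findByName(sampleDestinations(), "chap").entrySet()) {
            System.out.println(e.getKey() + " -> " + e.getValue());
        }
    }
}
```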

    Named destinations are predefined targets in a PDF file that can be found by name. Although the official term is named destinations, some people call them bookmarks too (but when we say bookmarks in the context of PDF, we usually mean outlines).
