How can I get metadata from a PDF document using pdf.js?

Submitted by 孤街醉人 on 2020-02-20 04:52:18

Question


Is there any way to get metadata such as the author or title from a PDF document using pdf.js?

In this example: http://mozilla.github.io/pdf.js/web/viewer.html?file=compressed.tracemonkey-pldi-09.pdf

<div class="row">
<span data-l10n-id="document_properties_author">
    Autor:
</span>
<p id="authorField">
    -
</p>

The authorField is empty. Is there any way to get this information?


Answer 1:


Using just the PDF.js library, without a third-party viewer, you can get the metadata with promises like so:

// Written against the older global `PDFJS` API. Newer releases expose
// `pdfjsLib` instead, and getDocument(url) returns a loading task whose
// `.promise` has to be used.
PDFJS.getDocument(url).then(function (pdfDoc_) {
    pdfDoc = pdfDoc_; // keep a reference for rendering later
    pdfDoc.getMetadata().then(function (stuff) {
        console.log(stuff); // Object with `info` and `metadata` properties
    }).catch(function (err) {
        console.log('Error getting metadata');
        console.log(err);
    });

    // Render the first page or whatever here
    // More code . . .
}).catch(function (err) {
    console.log('Error getting PDF from ' + url);
    console.log(err);
});

I found this out after dumping the pdfDoc object to the console and looking through its functions and properties. I found the method in its prototype and decided to just give it a shot. Lo and behold it worked!
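As a rough sketch of what can be done with the resolved object: the fields the question asks about (author, title) usually live under its `info` property, for example `info.Title` and `info.Author`. The helper names `pickDocumentInfo` and `readDocumentInfo` below are made up for illustration and are not part of PDF.js; the only library call assumed is `getMetadata()` on an already loaded document object.

```javascript
// Hypothetical helper (not part of PDF.js): extracts common fields from
// the object that getMetadata() resolves to. Missing fields become null.
function pickDocumentInfo(meta) {
  const info = (meta && meta.info) || {};
  return {
    title: info.Title || null,
    author: info.Author || null,
  };
}

// Hypothetical wrapper: `pdfDoc` is assumed to be a loaded PDF.js document,
// i.e. any object exposing getMetadata() that returns a promise.
async function readDocumentInfo(pdfDoc) {
  const meta = await pdfDoc.getMetadata();
  return pickDocumentInfo(meta);
}
```

With a loaded document, calling `readDocumentInfo(pdfDoc).then(console.log)` should log an object with `title` and `author`. A PDF that simply does not define an author yields `null` here, which may be why a field such as authorField stays empty.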




Answer 2:


You can get the document's basic metadata from the PDFViewerApplication.documentInfo object. For example, to get the author, use PDFViewerApplication.documentInfo.Author.
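A small sketch of that lookup with a guard, on the assumption that `documentInfo` may not be populated until a document has finished loading in the viewer. `getViewerField` is a hypothetical helper name, not something the viewer provides.

```javascript
// Hypothetical helper: `app` is expected to be the viewer's
// PDFViewerApplication object; `field` is a key such as 'Author' or 'Title'.
function getViewerField(app, field) {
  const info = app && app.documentInfo;
  // Assumption: documentInfo is unset until a document has been loaded.
  if (!info) {
    return undefined;
  }
  return info[field];
}
```

In the viewer page's console this would be called as `getViewerField(PDFViewerApplication, 'Author')`.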




Answer 3:


// Assumes pdfDoc is the document object obtained from getDocument and that
// jQuery is loaded. getMetadata() takes no arguments.
pdfDoc.getMetadata().then(function (stuff) {
    var title = stuff.info.Title;
    if (title) {
        $('#element-html').text(title); // Print metadata to html
    }
    console.log(stuff); // Print metadata to console
}).catch(function (err) {
    console.log('Error getting metadata');
    console.log(err);
});


Source: https://stackoverflow.com/questions/22743491/how-can-i-get-metadata-from-pdf-document-using-pdf-js
