Render .pdf to single Canvas using pdf.js and ImageData

没有蜡笔的小新 2020-12-13 16:15

I am trying to read an entire .pdf Document using PDF.js and then render all the pages on a single canvas.

My idea: render each page onto a canvas and get the ImageD

3 Answers
  •  感情败类
    2020-12-13 17:01

    The PDF operations are asynchronous at all stages. This means you also need to wait for the promise returned by the final render call. If you do not, you will only get a blank canvas, because the rendering has not finished before the loop continues to the next page.
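    To illustrate waiting for every stage, here is a minimal sketch that keeps a loop but awaits each promise before moving on. It is an alternative written with async/await, not the approach used in the code further down. `renderAllPages`, `renderPage`, `fakePdf` and `fakeRender` are hypothetical names; the fake document only stands in for pdf.js so the sketch runs on its own.

```javascript
// Sketch: process pages strictly one after another with async/await.
// `renderPage` is a hypothetical callback supplied by the caller; it must
// return a promise that settles only once the page is fully drawn.
async function renderAllPages(pdf, renderPage) {
    var results = [];
    for (var n = 1; n <= pdf.numPages; n++) {
        var page = await pdf.getPage(n);       // wait for the page object
        results.push(await renderPage(page));  // wait for the render to finish
    }
    return results;
}

// Stand-in for a pdf.js document (assumption, for demonstration only).
var fakePdf = {
    numPages: 3,
    getPage: function (n) {
        return new Promise(function (resolve) {
            // later pages resolve faster, to show that order is still kept
            setTimeout(function () { resolve({ pageNumber: n }); }, (4 - n) * 5);
        });
    }
};

// Stand-in for "render the page and return its stored result".
function fakeRender(page) {
    return Promise.resolve('page ' + page.pageNumber);
}

renderAllPages(fakePdf, fakeRender).then(function (pages) {
    console.log(pages.join(', ')); // page 1, page 2, page 3
});
```

    With the real library, `renderPage` would presumably size the canvas, call `page.render(...)`, and resolve with whatever you want to store for that page.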

    Tip: I would also recommend using something other than getImageData, as it stores an uncompressed bitmap. A data-uri (from canvas.toDataURL()), for example, holds compressed image data instead.
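    As a rough illustration of why the uncompressed bitmap is costly: ImageData holds 4 bytes (RGBA) per pixel. The page size below is an assumption (a US Letter page of 612 x 792 points at scale 1.5), and `imageDataBytes` is a hypothetical helper.

```javascript
// Bytes held by an ImageData object: 4 bytes (RGBA) per pixel, uncompressed.
function imageDataBytes(width, height) {
    return width * height * 4;
}

// Assumed example page: US Letter (612 x 792 points) at scale 1.5.
var pageWidth = 612 * 1.5;   // 918 px
var pageHeight = 792 * 1.5;  // 1188 px

console.log(imageDataBytes(pageWidth, pageHeight)); // 4362336 bytes per page
```

    That is a little over 4 MiB for every page kept in memory, whatever the page contains. The size of a data-uri depends on the content, so no figure is given for it here.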

    Here is a slightly different approach that eliminates the for-loop and makes better use of the promises:

    var canvas = document.createElement('canvas'), // single off-screen canvas
        ctx = canvas.getContext('2d'),             // to render to
        pages = [],
        currentPage = 1,
        url = 'path/to/document.pdf';              // specify a valid url
    
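    // Load the PDF document. This is the pdf.js 1.x API; in later releases
    // the global is pdfjsLib, getDocument(url) and page.render(...) expose
    // their promise via .promise, and getViewport takes { scale: scale }.
    PDFJS.getDocument(url).then(iterate);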
    
    /* To avoid too many nesting levels, which easily happens with chained
       promises, the function is defined separately and only referenced in
       the first promise callback.
    */
    
    function iterate(pdf) {
    
        // init parsing of first page
        if (currentPage <= pdf.numPages) getPage();
    
        // main entry point/function for loop
        function getPage() {
    
            // when the promise resolves, proceed as usual
            pdf.getPage(currentPage).then(function(page) {
    
                var scale = 1.5;
                var viewport = page.getViewport(scale);
    
                canvas.height = viewport.height;
                canvas.width = viewport.width;
    
                var renderContext = {
                    canvasContext: ctx,
                    viewport: viewport
                };
    
                // now, tap into the returned promise from render:
                page.render(renderContext).then(function() {
    
                    // store compressed image data in array
                    pages.push(canvas.toDataURL());
    
                    if (currentPage < pdf.numPages) {
                        currentPage++;
                        getPage();        // get next page
                    }
                    else {
                        done();           // call done() when all pages are parsed
                    }
                });
            });
        }
    
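    }
    
    // called once all pages are stored in `pages`; put follow-up work here
    function done() {
        console.log(pages.length + ' pages rendered');
    }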
    

    When you then need to retrieve a page you simply create an image element and set the data-uri as source:

    function drawPage(index, callback) {
        var img = new Image();
        img.onload = function() {
            /* this draws the loaded image onto the canvas at position 0,0,
               scaled to the width and height of the canvas (those two
               arguments are optional). 'this' is the image just loaded.
            */
            ctx.drawImage(this, 0, 0, ctx.canvas.width, ctx.canvas.height);
            callback();          // invoke callback when we're done
        };
        img.src = pages[index];  // start loading the data-uri as source
    }
    

    Because the image has to load, this step is asynchronous as well, which is why the callback is needed. If you want to avoid that, you could do this step (creating the image element and setting its source) in the render promise above, storing image elements instead of data-uris so that the loading starts up front.
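    Since the question asks for all pages on a single canvas, one possible way to combine the stored pages is to stack them vertically. The sketch below only computes the layout; `stackLayout`, its `gap` parameter and the page sizes are assumptions for illustration, not part of pdf.js.

```javascript
// Compute where each page goes when all pages are stacked vertically on
// one canvas. `sizes` is a list of {width, height} in pixels (hypothetical
// input, e.g. collected from each page's viewport); `gap` is optional
// spacing in pixels between consecutive pages.
function stackLayout(sizes, gap) {
    gap = gap || 0;
    var y = 0;
    var maxWidth = 0;
    var offsets = sizes.map(function (size, i) {
        var top = y;                                   // top edge of this page
        y += size.height + (i < sizes.length - 1 ? gap : 0);
        maxWidth = Math.max(maxWidth, size.width);
        return top;
    });
    return { width: maxWidth, height: y, offsets: offsets };
}

var layout = stackLayout([
    { width: 918, height: 1188 },
    { width: 918, height: 1188 },
    { width: 1188, height: 918 }    // a landscape page
], 10);

console.log(layout.width, layout.height, layout.offsets);
// 1188 3314 [ 0, 1198, 2396 ]
```

    You would then set the canvas width and height once from `layout` (resizing a canvas clears it, so do this before drawing anything) and draw page i with `ctx.drawImage(img, 0, layout.offsets[i])` in its onload handler.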

    Hope this helps!
