pdf.js with text selection

人走茶凉 提交于 2019-12-03 07:28:24

问题


How to make the text in a PDF selectable?

Have tried here. The PDF is written fine, but no text selection

https://github.com/mozilla/pdf.js

https://github.com/mozilla/pdf.js/blob/master/web/text_layer_builder.css
https://github.com/mozilla/pdf.js/blob/master/web/text_layer_builder.js

'use strict';

PDFJS.getDocument('file.pdf').then(function(pdf){
    var page_num = 1;
    pdf.getPage(page_num).then(function(page){
        var scale = 1.5;
        var viewport = page.getViewport(scale);
        var canvas = document.getElementById('the-canvas');
        var context = canvas.getContext('2d');
        canvas.height = viewport.height;
        canvas.width = viewport.width;

        var canvasOffset = $(canvas).offset();
        var $textLayerDiv = $('#text-layer').css({
            height : viewport.height+'px',
            width : viewport.width+'px',
            top : canvasOffset.top,
            left : canvasOffset.left
        });

        page.render({
            canvasContext : context,
            viewport : viewport
        });

        page.getTextContent().then(function(textContent){
            var textLayer = new TextLayerBuilder({
                textLayerDiv : $textLayerDiv.get(0),
                pageIndex : page_num - 1,
                viewport : viewport
            });

            textLayer.setTextContent(textContent);
            textLayer.render();
        });
    });
});

<body>
  <div>
    <canvas id="the-canvas" style="border:1px solid black;"></canvas>
    <div id="text-layer" class="textLayer"></div>
  </div>
</body>

回答1:


Your javascript code is perfect. You just need to include the UI utilities that Text Layer Builder depends on:

https://github.com/mozilla/pdf.js/blob/master/web/ui_utils.js

Or in HTML:

<script src="https://raw.githubusercontent.com/mozilla/pdf.js/master/web/ui_utils.js"></script>

If you run your code (without ui_utils) and check the debug console, you will see ReferenceError: CustomStyle is not defined. A quick search in PDFjs's repo will show you it is defined in ui_utils.js.

Here is my minimal but complete code for your reference. I am using PDFjs's demo pdf here. Note that in production you should not link to raw.github.

<!DOCTYPE html><meta charset="utf-8">
<link rel="stylesheet" href="https://raw.githubusercontent.com/mozilla/pdf.js/master/web/text_layer_builder.css" />
<script src="https://ajax.googleapis.com/ajax/libs/jquery/2.1.4/jquery.min.js"></script>
<script src="https://raw.githubusercontent.com/mozilla/pdf.js/master/web/ui_utils.js"></script>
<script src="https://raw.githubusercontent.com/mozilla/pdf.js/master/web/text_layer_builder.js"></script>
<script src="https://mozilla.github.io/pdf.js/build/pdf.js"></script>
<body>
  <div>
    <canvas id="the-canvas" style="border:1px solid black;"></canvas>
    <div id="text-layer" class="textLayer"></div>
  </div>
<script>
'use strict';

PDFJS.getDocument('file.pdf').then(function(pdf){
    var page_num = 1;
    pdf.getPage(page_num).then(function(page){
        var scale = 1.5;
        var viewport = page.getViewport(scale);
        var canvas = $('#the-canvas')[0];
        var context = canvas.getContext('2d');
        canvas.height = viewport.height;
        canvas.width = viewport.width;

        var canvasOffset = $(canvas).offset();
        var $textLayerDiv = $('#text-layer').css({
            height : viewport.height+'px',
            width : viewport.width+'px',
            top : canvasOffset.top,
            left : canvasOffset.left
        });

        page.render({
            canvasContext : context,
            viewport : viewport
        });

        page.getTextContent().then(function(textContent){
           console.log( textContent );
            var textLayer = new TextLayerBuilder({
                textLayerDiv : $textLayerDiv.get(0),
                pageIndex : page_num - 1,
                viewport : viewport
            });

            textLayer.setTextContent(textContent);
            textLayer.render();
        });
    });
});
</script>



回答2:


After hours of struggling with this I found this article to be very helpful about selecting text and using pdf.js without node. Custom PDF Rendering in JavaScript with Mozilla’s PDF.Js




回答3:


Hello you have created canvas in your HTML content.

Canvas will not support text selection so you need to change the canvas to another way.



来源:https://stackoverflow.com/questions/33063213/pdf-js-with-text-selection

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!