Flask / postgres - display pdf with PDFJS

Posted by 泪湿孤枕 on 2019-12-24 18:39:25

Question


I have a very simple application. A user uploads a pdf file to a postgres database via the web front end. That pdf should then be rendered in the browser via pdfjs.

I'm fairly certain my issue is an encoding one, but I don't think I understand encoding well enough to answer this on my own.

My model:

class Lesson(Base):
    __tablename__ = 'lessons'

    # Name of the lesson
    lesson_order = db.Column(db.Enum(LessonIndexes), nullable=False)
    name = db.Column(db.String(128), nullable=False)
    summary = db.Column(db.String(500))
    lesson_plan_id = db.Column(db.Integer(), ForeignKey('lesson_plans.id'), nullable=False)
    pdf = db.Column(db.LargeBinary())

My Controller:

@mod_lp.route('/<lesson_plan_id>/create_lesson', methods=["POST"])
def create_lesson(lesson_plan_id):
    form = LessonForm()
    file = request.files['pdf']  # type: FileStorage

    if form.validate_on_submit():
        file = request.files['pdf']
        lesson = Lesson(form.lesson_order.data, form.name.data, form.summary.data, lesson_plan_id,
                        pdf=file.read() # this line here
                        )
        db.session.add(lesson)
        db.session.commit()
    return redirect(url_for('lesson_plan.show', lesson_plan_id=lesson_plan_id))

The stored data looks something like this:

%PDF-1.4
%����
1 0 obj
<</Creator (Mozilla/5.0 \(Macintosh; Intel Mac OS X 10_12_6\) AppleWebKit/537.36 \(KHTML, like Gecko\) Chrome/60.0.3112.113 Safari/537.36)
/Producer (Skia/PDF m60)
/CreationDate (D:20170916222407+00'00')
/ModDate (D:20170916222407+00'00')>>
endobj
2 0 obj
<</Filter /FlateDecode
/Length 1370>> stream
x���ݎ�4��<�������   qq$8�@%`aB�H�_�����T�E���ړ�c'�t�Z��[������}�{�I���@���

(etc...)

My JavaScript (taken from the PDFJS "hello world" example):

var pdfString = "{{ pdf_data}}";
var pdfData = atob(pdfString);
if (pdfData) {
    var loadingTask = PDFJS.getDocument({data: pdfData});
    loadingTask.promise.then(function (pdf) {
        console.log('PDF loaded');

        // Fetch the first page
        var pageNumber = 1;
        pdf.getPage(pageNumber).then(function (page) {
            console.log('Page loaded');

            var scale = 1.5;
            var viewport = page.getViewport(scale);

            // Prepare canvas using PDF page dimensions
            var canvas = document.getElementById('pdf-canvas');
            var context = canvas.getContext('2d');
            canvas.height = viewport.height;
            canvas.width = viewport.width;

            // Render PDF page into canvas context
            var renderContext = {
                canvasContext: context,
                viewport: viewport
            };
            var renderTask = page.render(renderContext);
            renderTask.then(function () {
                console.log('Page rendered');
            });
        });
    }, function (reason) {
        // PDF loading error
        console.error(reason);
    });
}

The current error I have is:

6:108 Uncaught DOMException: Failed to execute 'atob' on 'Window': The string to be decoded is not correctly encoded.

Things I've tried:

file.stream.getvalue()

file.stream.getvalue().decode("latin-1") # for whatever reason, this was the only 'decode' that didn't throw an error

file.stream.getvalue().decode("latin-1").encode()

base64.b64encode(file.stream.getvalue().decode("latin-1").encode())

But these all failed in various ways.

UPDATE:

If I send the binary data in the database to my template:

pdf_data = lesson.pdf

and forget about calling atob on it:

var pdfData = pdfString;
        if (pdfData) {
...

I get this error:

Error: Invalid XRef stream header
pdf.worker.js:340     at error (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:340:17)
    at XRef_readXRef [as readXRef] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20943:13)
    at XRef_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:20613:28)
    at PDFDocument_setup [as setup] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26445:17)
    at PDFDocument_parse [as parse] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:26336:12)
    at http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36120:28
    at Promise (<anonymous>)
    at LocalPdfManager_ensure [as ensure] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36115:14)
    at LocalPdfManager.BasePdfManager_ensureDoc [as ensureDoc] (http://0.0.0.0:8080/static/js/pdfjs/build/pdf.worker.js:36067:19)

Answer 1:


atob expects a base64-encoded string, and I'm pretty sure that is the issue you are seeing. I put together a basic example that at least gets a successful call to atob. The 'source.pdf' is just a sample PDF I had on disk; you can swap in the data from your postgres table instead. You could probably also save the base64-encoded content in that postgres table so that you don't need to convert it on every request.

flask_app.py

from flask import Flask, render_template
import base64

app = Flask(__name__)


@app.route("/testing", methods=["GET"])
def get_test_file():
    with open("source.pdf", "rb") as data_file:
        data = data_file.read()
    encoded_data = base64.b64encode(data).decode('utf-8')
    return render_template("test.html", encoded_data=encoded_data)
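
To swap in the database content, the same encoding step can be applied to the stored column value. The following is only a minimal sketch and not part of the original answer: the helper name `encode_pdf_for_template` is hypothetical, and it assumes the column value arrives as `bytes` or as a `memoryview` (which some database drivers return for binary columns).

```python
import base64


def encode_pdf_for_template(raw):
    """Return base64 text for raw PDF bytes (hypothetical helper)."""
    # Some drivers hand back a memoryview for binary columns;
    # bytes() normalises both cases to plain bytes.
    data = bytes(raw)
    # b64encode returns ASCII-only bytes, so decoding as ASCII cannot fail.
    return base64.b64encode(data).decode("ascii")


# Stand-in for lesson.pdf: a PDF header followed by non-text bytes.
sample = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"
encoded = encode_pdf_for_template(memoryview(sample))

# Decoding the string restores the exact original bytes.
assert base64.b64decode(encoded) == sample
```

In the view, this would presumably be used as `render_template("test.html", encoded_data=encode_pdf_for_template(lesson.pdf))`, where `lesson` is a row loaded from the model in the question.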

test.html

<html>
<head>
</head>
<body>
  <script>
    var encoded_data = '{{ encoded_data }}';
    var pdf_data = atob(encoded_data);
  </script>
</body>
</html>
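
For context on the attempts listed in the question, here is a small sketch of why `.decode("latin-1").encode()` changes the data before it is base64 encoded, while encoding the raw bytes directly round-trips cleanly. The sample bytes are made up for illustration and are not a real PDF.

```python
import base64

# Made-up sample: a PDF header plus bytes above 0x7F, as found in real PDFs.
raw = b"%PDF-1.4\n%\xe2\xe3\xcf\xd3\n"

# Attempt from the question: latin-1 decode, then encode() (UTF-8 by default).
# Each byte above 0x7F turns into a two-byte UTF-8 sequence, so the
# result is no longer the same file.
mangled = raw.decode("latin-1").encode()

# Base64 encoding the raw bytes directly keeps them intact.
clean = base64.b64decode(base64.b64encode(raw))

print("raw length:", len(raw), "mangled length:", len(mangled))
print("mangled matches raw:", mangled == raw)
print("clean matches raw:", clean == raw)
```

This is why the answer's example calls `base64.b64encode` on the bytes as read from disk and only then decodes the base64 result to text for the template.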


Source: https://stackoverflow.com/questions/46265079/flask-postgres-display-pdf-with-pdfjs
