AWS Lambda tar file extraction doesn't seem to work

Submitted by 本秂侑毒 on 2021-01-28 11:27:38

Question


I'm trying to run serverless LibreOffice based on this tutorial. Here is the full Python Lambda function:

import boto3
import os

s3_bucket = boto3.resource("s3").Bucket("lambda-libreoffice-demo")
os.system("curl https://s3.amazonaws.com/lambda-libreoffice-demo/lo.tar.gz -o /tmp/lo.tar.gz && cd /tmp && tar -xf /tmp/lo.tar.gz")
convertCommand = "instdir/program/soffice --headless --invisible --nodefault --nofirststartwizard --nolockcheck --nologo --norestore --convert-to pdf --outdir /tmp"

def lambda_handler(event,context):
  inputFileName = event['filename']
  # Download the object to be converted from S3
  with open(f'/tmp/{inputFileName}', 'wb') as data:
      s3_bucket.download_fileobj(inputFileName, data)

  # Execute libreoffice to convert input file
  os.system(f"cd /tmp && {convertCommand} {inputFileName}")

  # Save converted object in S3
  outputFileName, _ = os.path.splitext(inputFileName)
  outputFileName = outputFileName  + ".pdf"
  f = open(f"/tmp/{outputFileName}","rb")
  s3_bucket.put_object(Key=outputFileName,Body=f,ACL="public-read")
  f.close()

The response when running the full script is:
"errorMessage": "ENOENT: no such file or directory, open '/tmp/example.pdf'",

So I began to debug it line by line.
Based on my debug prints, it fails right at the start, when trying to extract the binary in the os.system call on the second line after the imports:

os.path.exists('/tmp/lo.tar.gz')  # => True
os.path.exists('/tmp/instdir/program/soffice.bin')  # => False

So the download succeeds and the tar extraction looks like the problematic part. If I download the same file from S3 and run the tar command locally, it extracts just fine.
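Because os.system only returns an exit status and does not capture anything tar prints, the real error is easy to miss. A minimal sketch for surfacing it, assuming the extraction is run through Python's subprocess module instead (run_and_report is a hypothetical helper name, not part of the original function):

```python
import subprocess


def run_and_report(command, cwd=None):
    """Run a shell command and return (exit_code, stdout, stderr).

    Unlike os.system, this captures stderr so that a failing tar or
    soffice call can be printed or logged.
    """
    result = subprocess.run(
        command,
        shell=True,
        cwd=cwd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,  # decode output as text (works on Python 3.6+)
    )
    if result.returncode != 0:
        # Print the failure so it shows up in the function's log output
        print(f"command failed with exit code {result.returncode}: {command}")
        print(result.stderr)
    return result.returncode, result.stdout, result.stderr
```

For example, calling run_and_report("tar -xf /tmp/lo.tar.gz", cwd="/tmp") and printing the returned stderr should show whatever tar itself reports, rather than only the later ENOENT.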

I tried Node, Python 3.8, and Python 3.6. I also tried it with and without the layer (and the /opt/lo.tar.br path) as described here.


Answer 1:


I ran into the same issue.

I suspect the problem is a permissions error executing files in /tmp.

Try copying instdir/ to your home folder and running it from there.

Please write back to confirm if you test this!
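One way to test the permissions theory from inside the function is to look at the file mode and ask the OS whether the binary is executable. This is only a rough sketch; describe_executable is a hypothetical helper, and the path passed to it is whichever binary you expect to exist after extraction:

```python
import os
import stat


def describe_executable(path):
    """Return a dict saying whether `path` exists and can be executed."""
    if not os.path.exists(path):
        return {"exists": False, "executable": False, "mode": None}
    # stat.filemode renders the mode bits in `ls -l` style, e.g. "-rwxr-xr-x"
    mode = stat.filemode(os.stat(path).st_mode)
    return {
        "exists": True,
        "executable": os.access(path, os.X_OK),
        "mode": mode,
    }
```

Printing describe_executable("/tmp/instdir/program/soffice.bin") at the top of the handler would distinguish "the file was never extracted" from "the file is there but cannot be executed".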

I ended up creating a Docker container which installs LibreOffice properly, e.g.:

# Use Amazon Linux 2 (It's based on CentOS) as base image
FROM amazon/aws-lambda-provided:al2

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# Download and install LibreOffice (and deps)

RUN yum update -y \
    && yum clean all \
    && yum install -y wget tar gzip

RUN cd /tmp \
    && wget http://download.documentfoundation.org/libreoffice/stable/7.0.4/rpm/x86_64/LibreOffice_7.0.4_Linux_x86-64_rpm.tar.gz \
    && tar -xvf LibreOffice_7.0.4_Linux_x86-64_rpm.tar.gz

# For some reason we need to "clean all"
RUN cd /tmp/LibreOffice_7.0.4.2_Linux_x86-64_rpm/RPMS \
    && yum clean all \
    && yum -y localinstall *.rpm 

# Required deps for soffice
RUN yum -y install \
    fontconfig libXinerama.x86_64 cups-libs dbus-glib cairo libXext libSM libXrender

# NOTE: Should we install libreoffice-writer? (doesn't seem to be required)

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

# We need to read/write to S3 bucket
RUN yum -y install \
    awscli \
    jq

# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

# We test with this file
COPY test-template.docx /home/test-template.docx

# This code derives from Ari's original article
COPY process_doc.sh     /home/process_doc.sh
COPY bootstrap          /var/runtime/bootstrap
COPY function.sh        /var/task/function.sh

RUN chmod u+rx \
    /home/process_doc.sh \
    /var/runtime/bootstrap \
    /var/task/function.sh

CMD [ "function.sh.handler" ]
# ^ Why CMD not ENTRYPOINT

... and running it as a containerized Lambda: https://github.com/p-i-/lambda-container-image-with-custom-runtime-example



Source: https://stackoverflow.com/questions/65884502/aws-lambda-tar-file-extraction-doesnt-seem-to-work
