How to locally debug dependencies in a lambda layer?

Submitted by 限于喜欢 on 2021-01-07 03:10:59

Question


I'm creating a Lambda layer from a Dockerfile that installs Python packages to a directory and zips the result.

FROM amazonlinux

WORKDIR /
RUN yum update -y

# Install Python 3.7
RUN yum install python3 zip -y

RUN pip3.7 install --upgrade pip

# Install Python packages
RUN mkdir /packages
RUN echo "opencv-python" >> /packages/requirements.txt

RUN mkdir -p /packages/opencv-python-3.7/python/lib/python3.7/site-packages
RUN pip3.7 install -r /packages/requirements.txt -t /packages/opencv-python-3.7/python/lib/python3.7/site-packages


# Create zip files for Lambda Layer deployment
WORKDIR /packages/opencv-python-3.7/
RUN zip -r9 /packages/cv2-python37.zip .
WORKDIR /packages/
RUN rm -rf /packages/opencv-python-3.7/

With this Dockerfile, I can deploy the layer successfully.
Now I want to add more libraries¹, but despite a successful docker build and upload, executing the Lambda function fails with an error (numpy not found). I would like an easier way to debug this than changing the Dockerfile, building, extracting and uploading the zip file, and pressing 'Test' in the AWS Management Console.

I've tried running the Docker container locally, installing the packages there, and checking whether everything can be imported in a Python shell, but I cannot even reproduce the original working setup this way:

bash-4.2# pip3.7 install opencv-python
Collecting opencv-python
  Using cached opencv_python-4.4.0.42-cp37-cp37m-manylinux2014_x86_64.whl (49.4 MB)
Collecting numpy>=1.14.5
  Using cached numpy-1.19.1-cp37-cp37m-manylinux2010_x86_64.whl (14.5 MB)
Installing collected packages: numpy, opencv-python
Successfully installed numpy-1.19.1 opencv-python-4.4.0.42
bash-4.2# python3.7
Python 3.7.8 (default, Jul 24 2020, 20:26:49) 
[GCC 7.3.1 20180712 (Red Hat 7.3.1-9)] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> import cv2
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib64/python3.7/site-packages/cv2/__init__.py", line 5, in <module>
    from .cv2 import *
ImportError: libGL.so.1: cannot open shared object file: No such file or directory

How can I figure out the right dependencies on my local machine?

Update

I made it work with the versions below, but it would still be interesting to know how to test this locally.

¹ Specifically, I want the following packages:

opencv-python==3.4.3.18
scipy==1.4.1
scikit-learn==0.22.2.post1

Answer 1:


The problem here is that when opencv is installed, its dependencies aren't installed to your -t target location. They end up in pip's default install location, <somewhere>/site-packages/, in the Docker image.

So when you zip up your target location, all of the dependencies are missing. I would solve this by not passing a target to pip when you install opencv. Install it as you would any other package.
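One way to check for missing dependencies without uploading anything is to inspect the layer zip directly. The following is a minimal sketch, not part of the original answer: the function names `top_level_names` and `missing_packages` are hypothetical, and the prefix assumes the `python/lib/python3.7/site-packages` layout used in the question's Dockerfile.

```python
import zipfile

# Layout taken from the question's Dockerfile; adjust for other runtimes.
DEFAULT_PREFIX = "python/lib/python3.7/site-packages/"


def top_level_names(zip_path, prefix=DEFAULT_PREFIX):
    """Return the top-level entries found under `prefix` inside a layer zip."""
    names = set()
    with zipfile.ZipFile(zip_path) as archive:
        for entry in archive.namelist():
            if entry.startswith(prefix):
                rest = entry[len(prefix):]
                if rest:
                    # Keep only the first path component, e.g. "cv2" or "numpy".
                    names.add(rest.split("/", 1)[0])
    return names


def missing_packages(zip_path, expected, prefix=DEFAULT_PREFIX):
    """Return the expected top-level names that are absent from the zip."""
    return sorted(set(expected) - top_level_names(zip_path, prefix))
```

For example, `missing_packages("cv2-python37.zip", ["cv2", "numpy"])` would list `numpy` if only `cv2` made it into the archive. This only checks that the directories exist; it does not prove that the shared libraries they need can be loaded.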

From within your Docker image, call python -m site --user-site to get the pip install location.
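Note that --user-site prints the per-user directory, while the traceback in the question shows packages landing in /usr/local/lib64/python3.7/site-packages, so it may help to list every candidate location. A small sketch, assuming a standard CPython install (the helper name `candidate_install_dirs` is hypothetical):

```python
import site
import sysconfig


def candidate_install_dirs():
    """List directories where pip may have placed packages, without duplicates."""
    dirs = []
    # System-wide site-packages directories (guarded, since some older
    # virtualenv setups replace the site module and omit this function).
    if hasattr(site, "getsitepackages"):
        dirs.extend(site.getsitepackages())
    # Per-user directory, the same value `python -m site --user-site` prints.
    dirs.append(site.getusersitepackages())
    # Default install targets for pure-Python and platform-specific packages.
    paths = sysconfig.get_paths()
    dirs.append(paths["purelib"])
    dirs.append(paths["platlib"])

    unique = []
    for directory in dirs:
        if directory not in unique:
            unique.append(directory)
    return unique


if __name__ == "__main__":
    for directory in candidate_install_dirs():
        print(directory)
```

Running this inside the container and checking which directory actually contains cv2 and numpy shows which one to zip.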

Modify your Docker commands to zip up that entire directory after opencv has been installed, then use that zip for your Lambda layer.
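The zipping step can also be done from Python rather than with the zip command. This is a rough sketch under stated assumptions: `zip_site_packages` is a hypothetical helper, and the archive prefix mirrors the `python/lib/python3.7/site-packages` layout from the question's Dockerfile.

```python
import os
import zipfile


def zip_site_packages(site_dir, zip_path,
                      prefix="python/lib/python3.7/site-packages"):
    """Zip everything under `site_dir` into `zip_path` beneath `prefix`.

    `zip_path` should live outside `site_dir`, otherwise the archive
    would try to include itself.
    """
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for root, _dirs, files in os.walk(site_dir):
            for name in files:
                full_path = os.path.join(root, name)
                relative = os.path.relpath(full_path, site_dir)
                # Zip entries always use forward slashes.
                arcname = "/".join([prefix] + relative.split(os.sep))
                archive.write(full_path, arcname)
    return zip_path
```

A call such as `zip_site_packages("/usr/local/lib64/python3.7/site-packages", "/packages/cv2-python37.zip")` would package the directory shown in the question's traceback. That path is taken from the question and may differ in other images.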




Answer 2:


I was able to use more recent versions of scikit-learn and cv2 with the following approach, which automates the deployment process and reduces the size of the packages by removing unnecessary files (**/*.py[c|o], **/__pycache__*, **/*.dist-info*).

I had to package both cv2 and scipy, and package size was a huge problem. This is the solution I arrived at in the end.

Using the serverless-python-requirements plugin with Serverless helped me streamline the whole process and reduce the package size as well. I would definitely recommend checking it out.
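To get a feel for the kind of size reduction described above, the same file patterns can be pruned by hand. This is only an approximation for illustration, not the plugin's actual implementation, and `prune_package_dir` is a hypothetical name. Removing *.dist-info directories can break packages that read their own metadata at runtime, so treat it as optional.

```python
import os
import shutil


def prune_package_dir(root_dir):
    """Delete bytecode files, __pycache__ and *.dist-info directories.

    Returns the list of removed paths so the effect can be reviewed.
    """
    removed = []
    for current, dirs, files in os.walk(root_dir, topdown=True):
        for dirname in list(dirs):
            if dirname == "__pycache__" or dirname.endswith(".dist-info"):
                target = os.path.join(current, dirname)
                shutil.rmtree(target)
                # Stop os.walk from descending into the deleted directory.
                dirs.remove(dirname)
                removed.append(target)
        for filename in files:
            if filename.endswith((".pyc", ".pyo")):
                target = os.path.join(current, filename)
                os.remove(target)
                removed.append(target)
    return removed
```

Running it on a copy of the packages directory before zipping makes it easy to compare archive sizes with and without pruning.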

This is the guide that I followed: Serverless python-requirements plugin

Make sure to leave the strip flag set to false to avoid stripping binaries, which leads to the error "ELF load command address/offset not properly aligned".

This is my final serverless.yml, which gave me the result I wanted: sklearn + cv2 packaged as a layer.

custom:
  pythonRequirements:
    dockerizePip: true
    useDownloadCache: true
    useStaticCache: false
    slim: true
    strip: false
    layer:
      name: ${self:provider.stage}-cv2-sklearn
      description: Python requirements lambda layer
      compatibleRuntimes:
        - python3.8
      allowedAccounts:
        - '*'

requirements.txt:

opencv-python-headless==4.4.0.42
scikit-learn==0.23.2


Source: https://stackoverflow.com/questions/63535960/how-to-locally-debug-dependencies-in-a-lambda-layer
