CI/CD tests involving pyspark - JAVA_HOME is not set

Posted by 你说的曾经没有我的故事 on 2020-07-09 16:25:26

Question


I am working on a project which uses pyspark, and would like to set up automated tests.

Here's what my .gitlab-ci.yml file looks like:

image: "myimage:latest"

stages:
  - Tests

pytest:
  stage: Tests
  script:
  - pytest tests/.

I built the Docker image myimage from a Dockerfile like the following (see this excellent answer):

FROM python:3.7
RUN  python --version

# Create app directory
WORKDIR /app

# copy requirements.txt
COPY local-src/requirements.txt ./


# Install app dependencies
RUN pip install -r requirements.txt

# Bundle app source
COPY src /app

However, when I run this, the GitLab CI job fails with the following error:

/usr/local/lib/python3.7/site-packages/pyspark/java_gateway.py:95: in launch_gateway
    raise Exception("Java gateway process exited before sending the driver its port number")
E   Exception: Java gateway process exited before sending the driver its port number
------------------------------- Captured stderr --------------------------------
JAVA_HOME is not set

I understand that pyspark requires Java 8 or higher to be installed. I have this set up locally, but what about during the CI process? How can I install Java so that it works there?

I have tried adding

RUN sudo add-apt-repository ppa:webupd8team/java
RUN sudo apt-get update
RUN apt-get install oracle-java8-installer

to the Dockerfile that builds the image, but got the error:

/bin/sh: 1: sudo: not found

How can I modify the Dockerfile so that tests using pyspark will work?


Answer 1:


Solution that worked for me: add

# Refresh the package index and install a JDK in the same layer
RUN apt-get update && apt-get install -y default-jdk

before

RUN pip install -r requirements.txt

It then all worked as expected with no further modifications needed!

EDIT

To make this work, I had to change my base image to python:3.7-stretch.
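A quick way to confirm that the rebuilt image really exposes Java is a small preflight check run before the pyspark tests. The following is only a sketch and is not part of the original answer: the function name find_java is hypothetical, and the lookup order (JAVA_HOME first, then PATH) is an assumption about how a Java launcher is usually located.

```python
import os
import shutil


def find_java(env=None):
    """Return the path to a java executable, or None if none is found.

    Hypothetical helper: checks $JAVA_HOME/bin/java first, then falls
    back to searching the directories listed in PATH.
    """
    env = os.environ if env is None else env
    java_home = env.get("JAVA_HOME")
    if java_home:
        candidate = os.path.join(java_home, "bin", "java")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    # shutil.which takes an explicit search path; None means "use the process PATH"
    return shutil.which("java", path=env.get("PATH"))
```

Calling find_java() at the start of the test session and failing with a clear message when it returns None tends to be easier to read than the "Java gateway process exited" traceback.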




Answer 2:


Add the following to your .bash_profile, replacing [yourjdk] with the directory name of your installed JDK (the path shown is the macOS layout):

export JAVA_HOME=/Library/Java/JavaVirtualMachines/[yourjdk]/Contents/Home
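Where editing a shell profile is not practical (for example inside a CI container), JAVA_HOME can also be set from Python before the Spark session is created. The sketch below is an illustration under stated assumptions, not something from the answer above: the helper names guess_java_home and ensure_java_home are hypothetical, and the heuristic assumes the conventional layout in which the java launcher lives at <JAVA_HOME>/bin/java, possibly behind symlinks as on Debian-based images. It may not hold on systems where the java on PATH is a stub rather than a link into the JDK.

```python
import os
import shutil


def guess_java_home(java_path):
    """Guess JAVA_HOME from the path of a java executable.

    Resolves symlinks, then strips the trailing bin/java
    (assumes the conventional <JAVA_HOME>/bin/java layout).
    """
    real = os.path.realpath(java_path)
    return os.path.dirname(os.path.dirname(real))


def ensure_java_home(env=None):
    """Set JAVA_HOME in env if it is missing and java is on PATH.

    Returns the resulting JAVA_HOME, or None if it cannot be determined.
    """
    env = os.environ if env is None else env
    if env.get("JAVA_HOME"):
        return env["JAVA_HOME"]  # respect an existing setting
    java = shutil.which("java", path=env.get("PATH"))
    if java is None:
        return None  # no Java installed; nothing sensible to set
    env["JAVA_HOME"] = guess_java_home(java)
    return env["JAVA_HOME"]
```

Calling ensure_java_home() before importing or starting pyspark only helps when a JDK is already installed; it does not replace installing Java in the image.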



Source: https://stackoverflow.com/questions/57676684/ci-cd-tests-involving-pyspark-java-home-is-not-set
