Question
I have .gitlab-ci.yml files for several projects that I want to deploy. After banging my head against the wall for several hours, I've realized that the jobs pass or fail randomly. Nothing changes between runs: I just press the "Retry" button, and sometimes the job passes and other times it fails.
This is the .gitlab-ci.yml that I'm working with:
image: docker:latest

services:
  - docker:dind

before_script:
  - echo "Runnig before_script"
  - apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
  - pip install docker-compose

stages:
  - test
  - build
  - deploy

test:
  stage: test
  script:
    - echo "Testing the app"
    - docker-compose run app sh -c "python /app/manage.py test && flake8"

build:
  stage: build
  only:
    - develop
    - production
    - feature/deploy-debug-gitlab
  script:
    - echo "Building the app"
    - docker-compose build

deploy:
  stage: deploy
  only:
    - master
    - feature/deploy
  script:
    - echo "Deploying the app"
    - docker-compose up -d
  environment: production
  when: manual
Am I doing something wrong there? It seems quite straightforward to me.
When the job fails, I always get the error "apk: command not found", like here:
Running with gitlab-runner 11.11.1 (5a147c92)
on My Runner Jd5HNvxy
Using Shell executor...
Running on ubuntu-512mb-lon1-01...
Reinitialized existing Git repository in /home/gitlab-runner/builds/Jd5HNvxy/0/<my.name>/<my.app>/.git/
Fetching changes...
Checking out 3f388ce6 as feature/deploy...
Skipping Git submodules setup
$ echo "Runnig before_script"
Runnig before_script
$ apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
bash: line 88: apk: command not found
ERROR: Job failed: exit status 1
When the job passes, I get this:
Running with gitlab-runner 11.11.1 (5a147c92)
on docker-auto-scale fa6cab46
Using Docker executor with image docker:latest ...
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:bed64de70fa1f4d0b5a498791647c45d954cb0306ec2852dbcfb956f4ff3b0d6 for docker:dind ...
Waiting for services to be up and running...
Pulling docker image docker:latest ...
Using docker image sha256:af42f41a7d73a4a181843011f62cbdefa6d0f546bc7b50f71163750e0475a928 for docker:latest ...
Running on runner-fa6cab46-project-12561543-concurrent-0 via runner-fa6cab46-srm-1559293461-381b8d99...
Initialized empty Git repository in /builds/<my.name>/<my.app>/.git/
Fetching changes...
Created fresh repository.
From https://gitlab.com/<my.name>/<my.app>
* [new branch] develop -> origin/develop
* [new branch] feature/deploy -> origin/feature/deploy
* [new branch] master -> origin/master
Checking out 3f388ce6 as feature/deploy...
Skipping Git submodules setup
$ echo "Runnig before_script"
Runnig before_script
$ apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
(1/30) Installing binutils (2.31.1-r2)
(2/30) Installing gmp (6.1.2-r1)
(3/30) Installing isl (0.18-r0)
...
...
...
Answer 1:
This happens because two different runners are picking up your jobs: "My Runner Jd5HNvxy" and "docker-auto-scale fa6cab46". Each job is executed by whichever runner grabs it first.
In the failing case, the job runs on a runner that uses the shell executor on an Ubuntu host, as the log shows ("Using Shell executor...", "Running on ubuntu-512mb-lon1-01..."). With the shell executor, the image and services settings are ignored and the commands run directly on the host. Ubuntu does not ship with the apk command, which is Alpine's package manager, so the job fails.
Your other runner uses the Docker executor, so it pulls the docker:latest image (Alpine-based, hence apk is available) and runs the job without issues.
Possible solutions:
- Remove/pause the shell runner.
- Make your runners "specific" runners and assign them to the projects manually.
- Add tags to your runners, e.g. "shell" and "docker". Then, in the CI config, declare which tag each job requires so that only a matching runner picks it up. See the official docs for more info: https://docs.gitlab.com/ce/ci/yaml/README.html#tags
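As a minimal sketch of the tagging option, applied to the test job from the question: the tag name "docker" is an assumption and has to match a tag that is actually set on the Docker-executor runner (check the runner's settings in GitLab to see which tags it carries).

```yaml
# Sketch only: pin the job to a runner tagged "docker".
# "docker" is an assumed tag name, not taken from the question.
test:
  stage: test
  tags:
    - docker
  script:
    - echo "Testing the app"
    - docker-compose run app sh -c "python /app/manage.py test && flake8"
```

A job that lists tags is only picked up by runners that have all of those tags, so the untagged shell runner would no longer take it. The same tags entry would need to be repeated in the build and deploy jobs.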
Source: https://stackoverflow.com/questions/56392019/gitlab-ci-jobs-fail-pass-randomly