GitLab CI jobs fail/pass randomly

Submitted by 一世执手 on 2019-12-23 05:19:51

Question


I have .gitlab-ci.yml files for several projects that I want to deploy. After banging my head against the wall for several hours, I've realized that the jobs pass or fail randomly. Nothing changes between runs: I just press the "Retry" button, and sometimes the job passes and other times it fails.

This is a .gitlab-ci.yml that I'm working with:

image: docker:latest

services:
  - docker:dind

before_script:
  - echo "Runnig before_script"
  - apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
  - pip install docker-compose

stages:
  - test
  - build
  - deploy

test:
  stage: test
  script:
    - echo "Testing the app"
    - docker-compose run app sh -c "python /app/manage.py test && flake8"

build:
  stage: build
  only:
    - develop
    - production
    - feature/deploy-debug-gitlab

  script:
    - echo "Building the app"
    - docker-compose build

deploy:
  stage: deploy
  only:
    - master
    - feature/deploy
  script:
    - echo "Deploying the app"
    - docker-compose up -d
  environment: production
  when: manual

Am I doing something wrong there? It seems quite straightforward to me.

When the job fails, I always get the error "apk: command not found", as in this log:

Running with gitlab-runner 11.11.1 (5a147c92)
  on My Runner Jd5HNvxy
Using Shell executor...
Running on ubuntu-512mb-lon1-01...
Reinitialized existing Git repository in /home/gitlab-runner/builds/Jd5HNvxy/0/<my.name>/<my.app>/.git/
Fetching changes...
Checking out 3f388ce6 as feature/deploy...
Skipping Git submodules setup
$ echo "Runnig before_script"
Runnig before_script
$ apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
bash: line 88: apk: command not found
ERROR: Job failed: exit status 1

When the job passes, I get this:

Running with gitlab-runner 11.11.1 (5a147c92)
  on docker-auto-scale fa6cab46
Using Docker executor with image docker:latest ...
Starting service docker:dind ...
Pulling docker image docker:dind ...
Using docker image sha256:bed64de70fa1f4d0b5a498791647c45d954cb0306ec2852dbcfb956f4ff3b0d6 for docker:dind ...
Waiting for services to be up and running...
Pulling docker image docker:latest ...
Using docker image sha256:af42f41a7d73a4a181843011f62cbdefa6d0f546bc7b50f71163750e0475a928 for docker:latest ...
Running on runner-fa6cab46-project-12561543-concurrent-0 via runner-fa6cab46-srm-1559293461-381b8d99...
Initialized empty Git repository in /builds/<my.name>/<my.app>/.git/
Fetching changes...
Created fresh repository.
From https://gitlab.com/<my.name>/<my.app>
 * [new branch]      develop        -> origin/develop
 * [new branch]      feature/deploy -> origin/feature/deploy
 * [new branch]      master         -> origin/master
Checking out 3f388ce6 as feature/deploy...

Skipping Git submodules setup
$ echo "Runnig before_script"
Runnig before_script
$ apk add --no-cache py-pip python-dev libffi-dev openssl-dev gcc libc-dev make
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/main/x86_64/APKINDEX.tar.gz
fetch http://dl-cdn.alpinelinux.org/alpine/v3.9/community/x86_64/APKINDEX.tar.gz
(1/30) Installing binutils (2.31.1-r2)
(2/30) Installing gmp (6.1.2-r1)
(3/30) Installing isl (0.18-r0)
...
...
...

Answer 1:


This happens because you have two different runners, "My Runner Jd5HNvxy" and "docker-auto-scale fa6cab46". Each job is executed by whichever runner picks it up first.

In the failing case, the job runs on a runner that uses the shell executor on an Ubuntu host, as the log shows ("Using Shell executor..." and "Running on ubuntu-512mb-lon1-01..."). With the shell executor, the script runs directly on the host and the image: setting is not used. Ubuntu does not ship with the apk command, so the job fails.

Your other runner uses the Docker executor, so it pulls the docker:latest image, which is Alpine-based and provides apk, and runs the job without issues.

Possible solutions:

  • Remove/pause the shell runner.
  • Make your runners "specific" runners and assign them to the projects manually.
  • Add tags to your runners, e.g. "shell" and "docker". Then, in the CI config, declare that each job must use a properly tagged runner. See the official docs for more info: https://docs.gitlab.com/ce/ci/yaml/README.html#tags
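
The tag-based option could look roughly like the sketch below. It assumes the Docker-executor runner has been given a tag named docker; that tag name is only an illustration and has to match whatever tag is actually set on the runner. Only the test job from the question is shown.

```yaml
# Sketch only: "docker" is an assumed tag name, not taken from the question.
# A job that lists tags is picked up only by runners that have all of them,
# so a shell runner without the "docker" tag will no longer run this job.
test:
  stage: test
  tags:
    - docker
  script:
    - echo "Testing the app"
    - docker-compose run app sh -c "python /app/manage.py test && flake8"
```

The build and deploy jobs would need the same tags entry, otherwise they can still land on the shell runner.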


Source: https://stackoverflow.com/questions/56392019/gitlab-ci-jobs-fail-pass-randomly
