Jenkins Kubernetes plugin failing to provision jnlp-slave pods

Submitted by 别说谁变了你拦得住时间么 on 2020-08-24 09:10:53

Question


I have Kubernetes 1.10.0, Docker 17.03.2-ce, and Jenkins 2.107.1 running on an Ubuntu 17.04 VM, with Kubernetes Plugin 1.5 installed in Jenkins. I have 4 other Ubuntu VMs successfully set up as nodes in the cluster, which also includes the untainted master. I can deploy nginx-based services directly and have unfettered access to the dashboard, so Kubernetes itself seems happy enough.

Before you mention it, let me say that we don't have short-term plans to run the Jenkins master inside Kubernetes itself, so I'd prefer to get this setup working.

The plugin configuration for the Kubernetes cloud is as follows:

"Name": kubernetes

"Kubernetes URL": https://172.20.43.30:6443

taken from:

# kubectl describe pods/kube-apiserver-jenkins-kube-master --namespace=kube-system | grep Liveness
Liveness:     http-get https://172.20.43.30:6443/healthz delay=15s timeout=15s period=10s #success=1 #failure=8

After accepting the insecure certificate, browsing to https://172.20.43.30:6443/ shows:

{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {

  },
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {

  },
  "code": 403
}

"Kubernetes server certificate key": obtained from

# kubectl get pods/kube-apiserver-jenkins-kube-master -o yaml --namespace=kube-system | grep tls
    - --tls-private-key-file=/etc/kubernetes/pki/apiserver.key
    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt

# cat /etc/kubernetes/pki/apiserver.crt
-----BEGIN CERTIFICATE-----
MIIDZ******
*******************
****PP5wigl
-----END CERTIFICATE-----

"Kubernetes Namespace": jenkins-slaves

The jenkins-slaves namespace was set up like this.

Create jenkins-namespace.yaml with the following content:

apiVersion: v1
kind: Namespace
metadata:
  name: jenkins-slaves
  labels:
    name: jenkins-slaves
spec:
  finalizers:
  - kubernetes

Then run:

# kubectl create -f jenkins-namespace.yaml
namespace "jenkins-slaves" created

# kubectl -n jenkins-slaves create sa jenkins
serviceaccount "jenkins" created

# kubectl create role jenkins --verb=get,list,watch,create,patch,delete --resource=pods
role.rbac.authorization.k8s.io "jenkins" created

# kubectl create rolebinding jenkins --role=jenkins --serviceaccount=jenkins-slaves:jenkins
rolebinding.rbac.authorization.k8s.io "jenkins" created

# kubectl create clusterrolebinding jenkins --clusterrole cluster-admin --serviceaccount=jenkins-slaves:jenkins
clusterrolebinding.rbac.authorization.k8s.io "jenkins" created
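
One way to confirm what the service account is actually allowed to do is to ask the API server directly. The following is only a sketch: the function names sa_subject and can_sa are made up for illustration, and it assumes kubectl is run by a user who is allowed to impersonate service accounts. Note that the role and rolebinding commands above omit -n, so they are created in the current context's namespace rather than in jenkins-slaves (unless that is the current namespace); the cluster-admin binding applies cluster-wide either way.

```shell
#!/bin/sh
# Build the RBAC user name of a service account:
#   system:serviceaccount:<namespace>:<name>
sa_subject() {
    printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether the service account may perform a verb on pods.
# $1 = namespace, $2 = service account name, $3 = verb
# Requires kubectl and permission to impersonate (assumption).
can_sa() {
    kubectl auth can-i "$3" pods --namespace "$1" --as "$(sa_subject "$1" "$2")"
}

# Example (not run here):
#   for v in get list watch create patch delete; do
#       can_sa jenkins-slaves jenkins "$v"
#   done
```

Each call prints yes or no, which makes it easy to see whether pod creation in jenkins-slaves is permitted before involving Jenkins at all.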

I added a Jenkins credential of type "Secret text" using the token printed by:

# kubectl get -n jenkins-slaves sa/jenkins --template='{{range .secrets}}{{ .name }} {{end}}' | xargs -n 1 kubectl -n jenkins-slaves get secret --template='{{ if .data.token }}{{ .data.token }}{{end}}' | head -n 1 | base64 -d -

Clicking "Test Connection" shows "Connection test successful".

It should be noted that the same token can be used to log in to the Kubernetes dashboard with full access rights.
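
The token can also be exercised outside Jenkins and the dashboard by calling the API server with it. This is a minimal sketch under stated assumptions: pods_url and list_pods_with_token are hypothetical helper names, TOKEN is expected to hold the decoded service account token, and -k skips certificate verification in the same way as accepting the insecure cert in the browser.

```shell
#!/bin/sh
# Build the API URL that lists pods in a namespace.
# $1 = API server base URL (trailing slash tolerated), $2 = namespace
pods_url() {
    printf '%s/api/v1/namespaces/%s/pods\n' "${1%/}" "$2"
}

# List pods using a bearer token taken from the TOKEN variable.
# -s silences progress output; -k skips TLS verification (assumption:
# acceptable on a test cluster, as with the browser check).
list_pods_with_token() {
    curl -sk -H "Authorization: Bearer ${TOKEN}" "$(pods_url "$1" "$2")"
}

# Example (not run here):
#   TOKEN=... list_pods_with_token https://172.20.43.30:6443 jenkins-slaves
```

Unlike the anonymous request shown earlier, a request carrying a valid token should not be rejected as system:anonymous.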

"Jenkins URL": http://172.20.43.30:8080

"Kubernetes Pod Template:Name": jnlp slave

"Kubernetes Pod Template:Namespace": jenkins-slaves

"Kubernetes Pod Template:Labels": jenkins-slaves

"Kubernetes Pod Template:Usage": Only build jobs with label expressions matching this node

"Kubernetes Pod Template:Container Template:Name": jnlp-slave

"Kubernetes Pod Template:Container Template:Docker image": jenkins/jnlp-slave

"Kubernetes Pod Template:Container Template:Working directory": ./.jenkins-agent

At this point, if I create a job and "Restrict where this project can be run" to a "Label Expression" of "jenkins-slaves", I get:

Label jenkins-slaves is serviced by no nodes and 1 cloud. Permissions or other restrictions provided by plugins may prevent this job from running on those nodes.

If I try to build the job, it sits in the build queue, and the "Build Executor Status" periodically shows "jnlp-slave-##### (offline) (suspended)", which disappears a couple of seconds later.

The system log says:

Apr 03, 2018 12:16:21 PM SEVERE org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher logLastLines
Error in provisioning; agent=KubernetesSlave name: jnlp-slave-t8004, template=PodTemplate{inheritFrom='', name='jnlp slave', namespace='jenkins-slaves', label='jenkins-slaves', nodeSelector='', nodeUsageMode=EXCLUSIVE, workspaceVolume=org.csanchez.jenkins.plugins.kubernetes.volumes.workspace.EmptyDirWorkspaceVolume@44dcba2d, containers=[ContainerTemplate{name='jnlp-slave', image='jenkins/jnlp-slave', workingDir='./.jenkins-agent', command='/bin/sh -c', args='cat', ttyEnabled=true, resourceRequestCpu='', resourceRequestMemory='', resourceLimitCpu='', resourceLimitMemory='', livenessProbe=org.csanchez.jenkins.plugins.kubernetes.ContainerLivenessProbe@58f0ceec}]}. Container jnlp exited with error 255. Logs: Warning: JnlpProtocol3 is disabled by default, use JNLP_PROTOCOL_OPTS to alter the behavior
Warning: SECRET is defined twice in command-line arguments and the environment variable
Warning: AGENT_NAME is defined twice in command-line arguments and the environment variable
Apr 03, 2018 4:16:16 PM hudson.remoting.jnlp.Main createEngine
INFO: Setting up agent: jnlp-slave-t8004
Apr 03, 2018 4:16:16 PM hudson.remoting.jnlp.Main$CuiListener <init>
INFO: Jenkins agent is running in headless mode.
Apr 03, 2018 4:16:16 PM hudson.remoting.Engine startEngine
INFO: Using Remoting version: 3.19
Apr 03, 2018 4:16:16 PM hudson.remoting.Engine startEngine
WARNING: No Working Directory. Using the legacy JAR Cache location: /home/jenkins/.jenkins/cache/jars
Apr 03, 2018 4:16:17 PM hudson.remoting.jnlp.Main$CuiListener status
INFO: Locating server among [http://172.20.43.30:8080/]
Apr 03, 2018 4:16:17 PM hudson.remoting.jnlp.Main$CuiListener error
SEVERE: http://172.20.43.30:8080/tcpSlaveAgentListener/ is invalid: 404 Not Found
java.io.IOException: http://172.20.43.30:8080/tcpSlaveAgentListener/ is invalid: 404 Not Found
    at org.jenkinsci.remoting.engine.JnlpAgentEndpointResolver.resolve(JnlpAgentEndpointResolver.java:197)
    at hudson.remoting.Engine.innerRun(Engine.java:518)
    at hudson.remoting.Engine.run(Engine.java:469)
Apr 03, 2018 12:16:21 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
Terminating Kubernetes instance for agent jnlp-slave-t8004
Apr 03, 2018 12:16:21 PM WARNING io.fabric8.kubernetes.client.Config tryServiceAccount
Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
Apr 03, 2018 12:16:21 PM INFO okhttp3.internal.platform.Platform log
ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Apr 03, 2018 12:16:21 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
Terminated Kubernetes instance for agent jenkins-slaves/jnlp-slave-t8004
Apr 03, 2018 12:16:21 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesSlave _terminate
Disconnected computer jnlp-slave-t8004
Apr 03, 2018 12:16:25 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
Excess workload after pending Kubernetes agents: 1
Apr 03, 2018 12:16:25 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
Template: Kubernetes Pod Template
Apr 03, 2018 12:16:25 PM WARNING io.fabric8.kubernetes.client.Config tryServiceAccount
Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
Apr 03, 2018 12:16:25 PM INFO okhttp3.internal.platform.Platform log
ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Apr 03, 2018 12:16:25 PM INFO hudson.slaves.NodeProvisioner$StandardStrategyImpl apply
Started provisioning Kubernetes Pod Template from kubernetes with 1 executors. Remaining excess workload: 0
Apr 03, 2018 12:16:35 PM WARNING io.fabric8.kubernetes.client.Config tryServiceAccount
Error reading service account token from: [/var/run/secrets/kubernetes.io/serviceaccount/token]. Ignoring.
Apr 03, 2018 12:16:35 PM INFO hudson.slaves.NodeProvisioner$2 run
Kubernetes Pod Template provisioning successfully completed. We have now 2 computer(s)
Apr 03, 2018 12:16:35 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
Excess workload after pending Kubernetes agents: 0
Apr 03, 2018 12:16:35 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesCloud provision
Template: Kubernetes Pod Template
Apr 03, 2018 12:16:35 PM INFO okhttp3.internal.platform.Platform log
ALPN callback dropped: HTTP/2 is disabled. Is alpn-boot on the boot class path?
Apr 03, 2018 12:16:35 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch
Created Pod: jnlp-slave-bnz94 in namespace jenkins-slaves
Apr 03, 2018 12:16:35 PM INFO org.csanchez.jenkins.plugins.kubernetes.KubernetesLauncher launch

-Steve Maring

Orlando, FL


Answer 1:


I went to http://172.20.43.30:8080/configureSecurity/ and set "Agents: TCP port for JNLP agents" to "Random".
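
The 404 on /tcpSlaveAgentListener/ in the question's log is the symptom to look for: it generally indicates that the TCP port for JNLP agents is disabled. A quick way to check the endpoint from any machine, sketched here with hypothetical helper names (listener_url, check_listener) and assuming curl is installed and the Jenkins master is reachable:

```shell
#!/bin/sh
# Build the agent listener URL from a Jenkins base URL
# (a trailing slash on the base URL is tolerated).
listener_url() {
    printf '%s/tcpSlaveAgentListener/\n' "${1%/}"
}

# Print only the HTTP status code returned by the listener endpoint.
# Requires curl and network access to the Jenkins master.
check_listener() {
    curl -s -o /dev/null -w '%{http_code}\n' "$(listener_url "$1")"
}

# Example (not run here):
#   check_listener http://172.20.43.30:8080
```

Running this before and after changing the setting shows whether the 404 from the log has gone away, without having to wait for a pod to be provisioned.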

I then got a "jnlp-slave-ttm5v (suspended)" entry that stayed in the "Build Executor Status",

and the log said:

Container is waiting jnlp-slave-ttm5v [jnlp-slave]: 
ContainerStateWaiting(message=Error response from daemon: the working directory './.jenkins-agent' is invalid, it needs to be an absolute path, reason=CreateContainerError, additionalProperties={})
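
The message comes from the Docker daemon, which rejects a relative working directory for the container. As a rough illustration of the rule (not the daemon's actual code), a path can be checked in shell as follows; is_absolute_path is a made-up name:

```shell
#!/bin/sh
# Succeed (exit status 0) if the given path is absolute, i.e. starts with "/".
is_absolute_path() {
    case "$1" in
        /*) return 0 ;;
        *)  return 1 ;;
    esac
}

# Example:
#   is_absolute_path ./.jenkins-agent || echo "relative: rejected"
#   is_absolute_path /home/jenkins    && echo "absolute: accepted"
```

By this rule, ./.jenkins-agent fails the check, while a value such as /home/jenkins passes.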

After setting "Working directory" to "/home/jenkins", I saw a pod actually get created on Kubernetes:

# kubectl get pods --namespace=jenkins-slaves
NAME               READY     STATUS    RESTARTS   AGE
jnlp-slave-1ds27   2/2       Running   0          42s

and my job ran successfully!

Started by user Buildguy
Agent jnlp-slave-1ds27 is provisioned from template Kubernetes Pod Template
Agent specification [Kubernetes Pod Template] (jenkins-slaves): 
* [jnlp-slave] jenkins/jnlp-slave(resourceRequestCpu: , resourceRequestMemory: , resourceLimitCpu: , resourceLimitMemory: )

Building remotely on jnlp-slave-1ds27 (jenkins-slaves) in workspace 
/home/jenkins/workspace/maven-parent-poms


Source: https://stackoverflow.com/questions/49635916/jenkins-kubernetes-plugin-failing-to-provision-jnlp-slave-pods
