Docker 1.12 worker not able to join the cluster (Swarm: Pending)

Anonymous (unverified), submitted on 2019-12-03 09:05:37

Question:

Manager version: Docker version 1.12.0-rc5, build a3f2063.

Worker version: Docker version 1.12.0-rc5, build a3f2063.

Created the Swarm manager:

docker swarm init --advertise-addr "172.25.30.2:4243"
Swarm initialized: current node (3kmewyb10p8xj3ke5rpjyw4s8) is now a manager.

To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-5lwzvv7au6hosiqqmdwmcxvmlmhtz4ts04jsg06284fq3posn0-enq26dqnwma38ij48hymtnioq \
    172.25.30.2:4243

To add a manager to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-5lwzvv7au6hosiqqmdwmcxvmlmhtz4ts04jsg06284fq3posn0-85cwe5pf779qw0knjn6wxdbim \
    172.25.30.2:4243

Then I tried to join a worker:

docker swarm join --token SWMTKN-1-5lwzvv7au6hosiqqmdwmcxvmlmhtz4ts04jsg06284fq3posn0-enq26dqnwma38ij48hymtnioq 172.25.30.2:4243
Error response from daemon: Timeout was reached before node was joined. Attempt to join the cluster will continue in the background. Use "docker info" command to see the current swarm status of your node.

I checked the logs on the worker:

time="2016-08-01T00:22:47.449844174-07:00" level=warning msg="failed to retrieve remote root CA certificate: rpc error: code = 1 desc = context canceled"
time="2016-08-01T00:22:47.449962215-07:00" level=warning msg="failed to retrieve remote root CA certificate: rpc error: code = 1 desc = context canceled"
time="2016-08-01T00:22:47.450025342-07:00" level=warning msg="failed to retrieve remote root CA certificate: rpc error: code = 1 desc = context canceled"
time="2016-08-01T00:22:47.450081950-07:00" level=warning msg="failed to retrieve remote root CA certificate: rpc error: code = 1 desc = context canceled"
time="2016-08-01T00:22:47.450142443-07:00" level=warning msg="failed to retrieve remote root CA certificate: rpc error: code = 1 desc = context canceled"
time="2016-08-01T00:22:47.450202836-07:00" level=error msg="cluster exited with error: rpc error: code = 1 desc = context canceled"
time="2016-08-01T00:23:31.351868722-07:00" level=error msg="Handler for POST /v1.24/swarm/join returned error: Timeout was reached before node was joined. Attempt to join the cluster will continue in the background. Use \"docker info\" command to see the current swarm status of your node."

In docker info, I saw "Swarm: Pending"
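To check that state from a script rather than by eye, something like the following might do. It is only a sketch: it assumes docker info prints a line of the form "Swarm: <state>" as quoted above, and the function name swarm_state is made up for illustration.

```shell
#!/bin/sh
# swarm_state: hypothetical helper (not part of Docker). Reads `docker info`
# text on stdin and prints the value of the "Swarm:" line in lowercase,
# e.g. "pending" or "active". Assumes the line looks like "Swarm: <state>".
swarm_state() {
    awk -F': *' '/^[[:space:]]*Swarm:/ { print tolower($2); exit }'
}

# Usage on a node (not executed here):
#   docker info 2>/dev/null | swarm_state
```

A worker that has joined successfully should report something other than "pending" here.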

I also ran docker swarm update. Still, the worker was not able to join the cluster. How can I resolve this?

UPDATE-1

Uninstalled Docker, removed the config files, and then installed Docker 1.12 again (Docker version 1.12.0, build 8eab29e).

Still facing the same problem (not able to join, and "Swarm: Pending" in docker info), with a different error in /var/log/upstart/docker.log:

time="2016-08-01T11:22:08.629760770-07:00" level=error msg="Handler for POST /v1.24/swarm/join returned error: Timeout was reached before node was joined. Attempt to join the cluster will continue in the background. Use \"docker info\" command to see the current swarm status of your node." 

Thanks.

Answer 1:

The problem was that I was trying to join using the wrong port (the one shown in the docker swarm init output).

1) Before "docker swarm init", the Docker daemon was listening on port 4243 only. I checked this with netstat -tulpn | grep docker, so I advertised that port:

root@veeru:~# netstat -tulpn | grep docker
tcp6       0      0 :::4243                 :::*                    LISTEN      8750/dockerd

root@veeru:~# docker swarm init --advertise-addr "172.25.30.2:4243"
Swarm initialized: current node (exvwgj0pu4cd124ljnblt9xff) is now a manager.

To add a worker to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-5j9mpo8hepue6g1sjdas33thr92w1o9hlef5auwqpbxs3glt39-6zomhgu204m9alq51f632nzas \
    172.25.30.2:4243

To add a manager to this swarm, run the following command:
    docker swarm join \
    --token SWMTKN-1-5j9mpo8hepue6g1sjdas33thr92w1o9hlef5auwqpbxs3glt39-axhgqgo4jqw4hv38x578m44wh \
    172.25.30.2:4243

2) After docker swarm init, the Docker daemon is listening on four sockets, including port 2377 (netstat -tupln | grep docker):

root@veeru:~# netstat -tulp | grep docker
tcp6       0      0 [::]:2377               [::]:*                  LISTEN      8750/dockerd
tcp6       0      0 [::]:7946               [::]:*                  LISTEN      8750/dockerd
tcp6       0      0 [::]:4243               [::]:*                  LISTEN      8750/dockerd
udp6       0      0 [::]:7946               [::]:*                              8750/dockerd
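The before/after comparison can be scripted roughly like this. It is a sketch that assumes the netstat column layout shown above (local address in the fourth column, PID/program name in the last); dockerd_tcp_ports is a hypothetical name, not an existing tool.

```shell
#!/bin/sh
# dockerd_tcp_ports: hypothetical helper. Reads `netstat -tulpn` output on
# stdin and prints the TCP ports dockerd is listening on, one per line.
dockerd_tcp_ports() {
    awk '$1 ~ /^tcp/ && $6 == "LISTEN" && $NF ~ /dockerd/ {
        n = split($4, parts, ":")   # the port is the last ":"-separated piece
        print parts[n]
    }' | sort -n
}

# Usage on the manager (not executed here):
#   netstat -tulpn | dockerd_tcp_ports
```

Run before and after docker swarm init, the difference between the two lists shows which ports swarm mode opened.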

The output in point 1 says to run docker swarm join with port 4243 on the worker. That is what I did at first, and it does not work.

Later I ran docker swarm leave on the worker and joined using port 2377. Now the worker is able to join:

docker swarm join --token SWMTKN-1-5j9mpo8hepue6g1sjdas33thr92w1o9hlef5auwqpbxs3glt39-6zomhgu204m9alq51f632nzas 172.25.30.2:2377 
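As a rough illustration of that fix, the sketch below rewrites whatever port was advertised to 2377, which is the default swarm management port according to the Docker documentation and the one that appeared after docker swarm init above. The helper name swarm_join_addr and the TOKEN variable are made up for this example, and it only handles IPv4 addresses or hostnames.

```shell
#!/bin/sh
# Default swarm management port (per the Docker docs; also seen in the
# netstat output after `docker swarm init`).
SWARM_PORT=2377

# swarm_join_addr: hypothetical helper. Takes "host" or "host:port" and
# prints "host:2377". IPv6 literals are not handled.
swarm_join_addr() {
    host=${1%:*}    # drop a trailing ":port" if there is one
    printf '%s:%s\n' "$host" "$SWARM_PORT"
}

# Usage on the worker (not executed here; TOKEN is a placeholder):
#   docker swarm leave
#   docker swarm join --token "$TOKEN" "$(swarm_join_addr 172.25.30.2:4243)"
```

It may be simpler to avoid the situation altogether by running docker swarm init --advertise-addr with just the IP address, in which case the port defaults to 2377 and the printed join command should already be usable.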


Answer 2:

I was facing a similar issue. In my case the port was being blocked by a firewall rule.



Answer 3:

For me it was a firewall issue too.

  1. I pinged the manager node and it responded.

  2. I checked whether the port was open using telnet and was not able to connect, so I figured out it was a port issue.

If you are running CentOS, the port can easily be opened using firewalld.

Check whether firewalld is running:

sudo firewall-cmd --state 

Open the port you need:

sudo firewall-cmd --zone=public --add-port=2377/tcp 

Change the port to whichever one your node is trying to connect to.
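For a swarm, more than one port is usually involved. The sketch below prints (rather than runs) firewall-cmd commands for the ports that the Docker documentation lists for swarm mode: 2377/tcp, 7946/tcp, 7946/udp and 4789/udp. That port list comes from the Docker docs rather than from this thread, and swarm_firewall_cmds is a hypothetical helper name. The --permanent flag is added because a plain --add-port only changes the runtime configuration.

```shell
#!/bin/sh
# swarm_firewall_cmds: hypothetical helper. Prints the firewall-cmd commands
# that would open the swarm-mode ports in the given zone (default: public).
# It only prints them, so they can be reviewed before being run.
swarm_firewall_cmds() {
    zone=${1:-public}
    for port in 2377/tcp 7946/tcp 7946/udp 4789/udp; do
        printf 'sudo firewall-cmd --zone=%s --permanent --add-port=%s\n' "$zone" "$port"
    done
    # Permanent rules take effect only after a reload.
    printf 'sudo firewall-cmd --reload\n'
}

# Usage: review the output, then pipe it to sh if it looks right:
#   swarm_firewall_cmds public
#   swarm_firewall_cmds public | sh
```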



Answer 4:

I was having the same issue. I was running CoreOS VMs in Azure and found that all my VMs had the same private IP address but different public IP addresses. This usually happens when the VMs are part of the same security group, but that was not the case this time. The issue was that my account had reached the maximum number of resources, so I deleted resources such as IP addresses, NSGs, networks, etc. and then provisioned new VMs. They had different private IPs, and when I ran the command again everything was fine. My Docker version is 1.12.6.


