RabbitMQ Cluster (Multiple Instances on a Single Machine)

若如初见. Posted on 2019-11-30 03:35:22

RabbitMQ Clustering

Distributed RabbitMQ brokers can be built in three ways: clustering, federation, and shovel. This section covers clustering.

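  • Requirements for building a RabbitMQ cluster:
    • a reliable network;
    • the same RabbitMQ and Erlang versions on every machine in the cluster.
  • How RabbitMQ clustering works:
    • virtual hosts, exchanges, users, and permissions are automatically replicated to every node in the cluster;
    • queues can live on a single node or be mirrored to several nodes;
    • a client connected to any node in the cluster can see all queues.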

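Building a RabbitMQ Cluster

There are many ways to form a RabbitMQ cluster; see Ways of Forming a Cluster. Here the author uses environment variables.

RabbitMQ serves clients on an IP address and port, so the basic requirement for configuring each instance is giving it its own ip:port to bind to (the default AMQP port is 5672). If you have run several instances of MySQL, Redis, or similar tools on one machine, the principle will be familiar; the example that follows makes it concrete.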
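As a rough sketch of the port layout used in this walkthrough, the helper below derives the ports for the N-th instance on one machine. The function name node_ports and the "default port plus offset" numbering are illustrative assumptions, not something RabbitMQ provides. The only RabbitMQ behavior relied on is the default AMQP port 5672, the default management port 15672, and the inter-node port defaulting to the AMQP port plus 20000.

```shell
# node_ports N: print "amqp management distribution" ports for instance N (1-based).
# Hypothetical helper for illustration only.
node_ports() {
    np_index=$1
    np_amqp=$((5672 + np_index - 1))      # value for RABBITMQ_NODE_PORT
    np_mgmt=$((15672 + np_index - 1))     # rabbitmq_management listener port
    np_dist=$((np_amqp + 20000))          # default inter-node and CLI tool port
    echo "$np_amqp $np_mgmt $np_dist"
}

for i in 1 2 3; do
    echo "rabbit$i: $(node_ports "$i")"
done
```

Giving every instance its own value for all three ports is what avoids the address-in-use failures that a second instance hits when it reuses the defaults.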

Starting multiple instances on a single machine

# Start the first node
$ RABBITMQ_NODE_PORT=5672 RABBITMQ_NODENAME=rabbit1 rabbitmq-server -detached

# Start the second node
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_NODENAME=rabbit2 rabbitmq-server -detached
Warning: PID file not written; -detached was passed.

# Checking the port status at this point shows that the second node did not come up!

The startup failed, so check the log:

$ less /var/log/rabbitmq/rabbit2.log
Error description:
    init:do_boot/3
    init:start_em/1
    rabbit:start_it/1 line 446
    rabbit:broker_start/0 line 322
    rabbit:start_apps/2 line 542
    app_utils:manage_applications/6 line 126
    lists:foldl/3 line 1263
    rabbit:'-handle_app_error/1-fun-0-'/3 line 638
throw:{could_not_start,rabbitmq_management,
       {rabbitmq_management,
        {bad_return,
         {{rabbit_mgmt_app,start,[normal,[]]},
          {'EXIT',
           {{could_not_start_listener,
             [{port,15672}],
             {shutdown,
              {failed_to_start_child,ranch_acceptors_sup,
               {listen_error,rabbit_web_dispatch_sup_15672,eaddrinuse}}}},
            {gen_server,call,
             [rabbit_web_dispatch_registry,
              {add,rabbit_mgmt,
               [{port,15672}],
               #Fun<rabbit_web_dispatch.0.82427196>,
               [{'_',[],
                 [{[],[],cowboy_static,
                   {priv_file,rabbitmq_management,"www/index.html"}},
                  {[<<"api">>,<<"overview">>],[],rabbit_mgmt_wm_overview,[]},
                  {[<<"api">>,<<"cluster-name">>],
                   [],rabbit_mgmt_wm_cluster_name,[]},
                  {[<<"api">>,<<"nodes">>],[],rabbit_mgmt_wm_nodes,[]},
                  {[<<"api">>,<<"nodes">>,node],[],rabbit_mgmt_wm_node,[]},

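In short, rabbitmq_management failed to start. After some research, the cause turned out to be a port conflict: the web management plugin of rabbit2 tried to listen on port 15672, which rabbit1 was already using (eaddrinuse). Each additional instance therefore also needs its own port for the management plugin.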
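One way to catch this kind of conflict before starting a node is to look for the port among the listening sockets. The sketch below is an assumption-laden illustration: port_in_use is a hypothetical helper that reads `netstat -lnt`-style lines on stdin, where column 4 is the local address and column 6 is the socket state. The sample lines are copied from the netstat output in this article.

```shell
# port_in_use PORT: succeed if a LISTEN socket on PORT appears in the
# netstat-style text read from stdin. Hypothetical helper for illustration.
port_in_use() {
    awk -v port="$1" '
        $6 == "LISTEN" {
            n = split($4, parts, ":")          # local address, e.g. 0.0.0.0:15672 or :::5672
            if (parts[n] == port) found = 1    # last ":"-separated field is the port
        }
        END { exit (found ? 0 : 1) }
    '
}

# Sample lines taken from the article's netstat output.
sample='tcp        0      0 0.0.0.0:15672           0.0.0.0:*               LISTEN
tcp6       0      0 :::5672                 :::*                    LISTEN'

for p in 15672 15673; do
    if printf '%s\n' "$sample" | port_in_use "$p"; then
        echo "port $p is already in use"
    else
        echo "port $p is free"
    fi
done
# On a real host the input would come from: netstat -lnt | port_in_use 15672
```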

# Start again, this time with an explicit management listener port
$ RABBITMQ_NODE_PORT=5673 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15673}]" RABBITMQ_NODENAME=rabbit2 rabbitmq-server -detached

# Check the port status
$ netstat -lntp
tcp        0      0 0.0.0.0:15672           0.0.0.0:*               LISTEN      10253/beam.smp      
tcp        0      0 0.0.0.0:15673           0.0.0.0:*               LISTEN      13922/beam.smp      
tcp        0      0 127.0.0.1:9797          0.0.0.0:*               LISTEN      632/python2         
tcp        0      0 0.0.0.0:25672           0.0.0.0:*               LISTEN      10253/beam.smp      
tcp        0      0 0.0.0.0:25673           0.0.0.0:*               LISTEN      13922/beam.smp      
tcp6       0      0 :::4369                 :::*                    LISTEN      10150/epmd          
tcp6       0      0 :::5672                 :::*                    LISTEN      10253/beam.smp      
tcp6       0      0 :::5673                 :::*                    LISTEN      13922/beam.smp 
# rabbit1 and rabbit2 started successfully

# Start the third node
$ RABBITMQ_NODE_PORT=5674 RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,15674}]" RABBITMQ_NODENAME=rabbit3 rabbitmq-server -detached

All three nodes are now running. Their status:

$ netstat -lntp
tcp        0      0 0.0.0.0:4369            0.0.0.0:*               LISTEN      10150/epmd          
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      773/sshd            
tcp        0      0 0.0.0.0:15672           0.0.0.0:*               LISTEN      10253/beam.smp      
tcp        0      0 0.0.0.0:15673           0.0.0.0:*               LISTEN      13922/beam.smp      
tcp        0      0 0.0.0.0:15674           0.0.0.0:*               LISTEN      14910/beam.smp      
tcp        0      0 127.0.0.1:9797          0.0.0.0:*               LISTEN      632/python2         
tcp        0      0 0.0.0.0:25672           0.0.0.0:*               LISTEN      10253/beam.smp      
tcp        0      0 0.0.0.0:25673           0.0.0.0:*               LISTEN      13922/beam.smp      
tcp        0      0 0.0.0.0:25674           0.0.0.0:*               LISTEN      14910/beam.smp      
tcp6       0      0 :::4369                 :::*                    LISTEN      10150/epmd          
tcp6       0      0 :::22                   :::*                    LISTEN      773/sshd            
tcp6       0      0 :::5672                 :::*                    LISTEN      10253/beam.smp      
tcp6       0      0 :::5673                 :::*                    LISTEN      13922/beam.smp      
tcp6       0      0 :::5674                 :::*                    LISTEN      14910/beam.smp 
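The start commands above differ only in the instance number, so they can be generated rather than typed by hand. The following is a minimal dry-run sketch: start_cmd is a hypothetical helper that only prints the command line for instance N, using the same environment variables as the commands in this article. It passes the management port explicitly for rabbit1 as well, which is equivalent to that node's default.

```shell
# start_cmd N: print (do not run) the command that starts instance N.
# Hypothetical helper; the port numbering mirrors the commands in the article.
start_cmd() {
    sc_index=$1
    sc_amqp=$((5672 + sc_index - 1))
    sc_mgmt=$((15672 + sc_index - 1))
    printf 'RABBITMQ_NODE_PORT=%s RABBITMQ_SERVER_START_ARGS="-rabbitmq_management listener [{port,%s}]" RABBITMQ_NODENAME=rabbit%s rabbitmq-server -detached\n' \
        "$sc_amqp" "$sc_mgmt" "$sc_index"
}

for i in 1 2 3; do
    start_cmd "$i"
done
```

Printing first makes it easy to review the commands; piping the output to sh would actually start the nodes.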

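Forming the cluster

I treat rabbit1 as the seed node and join the other two nodes to it (rabbit1 itself needs no changes; only the two joining nodes are reconfigured).

  • Join rabbit2 to the cluster: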

    $ rabbitmqctl -n rabbit2 stop_app
    
    $ rabbitmqctl -n rabbit2 reset
    
    $ rabbitmqctl -n rabbit2 join_cluster rabbit1@`hostname -s`
    
    $ rabbitmqctl -n rabbit2 start_app
    
  • Join rabbit3 to the cluster (same steps):

    $ rabbitmqctl -n rabbit3 stop_app
    
    $ rabbitmqctl -n rabbit3 reset
    
    $ rabbitmqctl -n rabbit3 join_cluster rabbit1@`hostname -s`
    
    $ rabbitmqctl -n rabbit3 start_app
    
  • Check the cluster status:

    $ rabbitmqctl cluster_status -n rabbit1@host3
    Cluster status of node rabbit1@host3 ...
    [{nodes,[{disc,[rabbit1@host3,rabbit2@host3,rabbit3@host3]}]},
     {running_nodes,[rabbit3@host3,rabbit2@host3,rabbit1@host3]},
     {cluster_name,<<"rabbit1@host3">>},
     {partitions,[]},
     {alarms,[{rabbit3@host3,[]},{rabbit2@host3,[]},{rabbit1@host3,[]}]}]
    
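  • The cluster status can also be viewed in the management UI at server_ip:port; any of the ports 15672, 15673, or 15674 can be used.

  • To add another node, repeat the steps in this section.

  • Remove a node:

    $ rabbitmqctl forget_cluster_node [--offline] <node_to_remove>
    # Run this against a running cluster member; the node being removed must be stopped
    # Not tested successfully
    
  • References:

Building a cluster on two machines

Basic configuration options

Cluster configuration

Official documentation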
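The join steps shown earlier for rabbit2 and rabbit3 follow the same four-command pattern, so they can be sketched as a small generator. join_cmds is a hypothetical helper that only prints the rabbitmqctl commands. The seed name rabbit1@host3 is taken from the cluster_status output above and is an assumption about your host; on another machine it would be rabbit1@ followed by the output of `hostname -s`.

```shell
# join_cmds NODE SEED: print the commands that join NODE to the cluster of SEED.
# Hypothetical dry-run helper; nothing is executed.
join_cmds() {
    jc_node=$1
    jc_seed=$2
    for jc_step in stop_app reset "join_cluster $jc_seed" start_app; do
        echo "rabbitmqctl -n $jc_node $jc_step"
    done
}

# Assumed seed node name; replace host3 with your short hostname.
SEED=${SEED:-rabbit1@host3}

for node in rabbit2 rabbit3; do
    join_cmds "$node" "$SEED"
done
```

The order matters: the application is stopped before reset, and started again only after join_cluster has succeeded.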

Cluster

Environment

This walkthrough builds a RabbitMQ cluster from two nodes:

Host   OS          rabbitmq-server version  Node
host1  CentOS 7.2  3.7.9                    node1
host2  CentOS 7.2  3.7.9                    node2

Forming the cluster

  • Start node1:

    $ systemctl start rabbitmq-server
    
  • Start node2:

    # Before starting, copy node1's Erlang cookie to node2 so that both nodes use the same one
    $ cat /var/lib/rabbitmq/.erlang.cookie
    $ systemctl start rabbitmq-server
    

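    .erlang.cookie is the file Erlang uses to authenticate nodes in a distributed system. Every node in the cluster must have an identical .erlang.cookie file, and the file's permissions must be 400.

  • Join node2 to node1. Run the following on node2:

    • reset: clears the node's existing data (without this, the node cannot join the cluster)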

      $ rabbitmqctl stop_app
      $ rabbitmqctl reset
      
    • join (in the output below, redis01 and infra01 appear to be the actual hostnames of host1 and host2)

      $ rabbitmqctl join_cluster rabbit@redis01
      Clustering node rabbit@infra01 with rabbit@redis01
      
      $ rabbitmqctl start_app
      
      # Check the cluster status
      $ rabbitmqctl cluster_status 
      Cluster status of node rabbit@infra01 ...
      [{nodes,[{disc,[rabbit@infra01,rabbit@redis01]}]},
       {running_nodes,[rabbit@redis01,rabbit@infra01]},
       {cluster_name,<<"rabbit@redis01">>},
       {partitions,[]},
       {alarms,[{rabbit@redis01,[]},{rabbit@infra01,[]}]}]
      
  • rabbitmqctl command reference: http://www.rabbitmq.com/rabbitmqctl.8.html

  • rabbitmqadmin: https://rabbit-new.chunyu.me/cli/index.html

  • Enable start on boot: systemctl enable rabbitmq-server

  • Cluster formation in depth: http://www.rabbitmq.com/cluster-formation.html
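After joining, it can be handy to confirm from a script that every expected node shows up under running_nodes. The sketch below assumes the Erlang-term output format shown in this article (rabbitmq-server 3.7.9); newer releases may print cluster_status differently, and a long node list may wrap across lines, which this one-line pattern does not handle. running_nodes is a hypothetical helper, and the sample text is the output from the steps above.

```shell
# running_nodes: read `rabbitmqctl cluster_status` text on stdin and print
# one running node per line. Assumes the 3.7-style Erlang-term output.
running_nodes() {
    sed -n 's/.*{running_nodes,\[\(.*\)]}.*/\1/p' | tr ',' '\n'
}

# Sample output copied from the article.
status='[{nodes,[{disc,[rabbit@infra01,rabbit@redis01]}]},
 {running_nodes,[rabbit@redis01,rabbit@infra01]},
 {cluster_name,<<"rabbit@redis01">>},
 {partitions,[]},
 {alarms,[{rabbit@redis01,[]},{rabbit@infra01,[]}]}]'

for expected in rabbit@redis01 rabbit@infra01; do
    if printf '%s\n' "$status" | running_nodes | grep -qxF "$expected"; then
        echo "$expected is running"
    else
        echo "$expected is NOT running"
    fi
done
# On a real node: rabbitmqctl cluster_status | running_nodes
```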

Troubleshooting:

The first attempt to form the cluster failed with this error:

Clustering node rabbit@host2 with rabbit@host1
Error: unable to perform an operation on node 'rabbit@redis01'. Please see diagnostics information and suggestions below.

Most common reasons for this are:

 * Target node is unreachable (e.g. due to hostname resolution, TCP connection or firewall issues)
 * CLI tool fails to authenticate with the server (e.g. due to CLI tool's Erlang cookie not matching that of the server)
 * Target node is not running

In addition to the diagnostics info below:

 * See the CLI, clustering and networking guides on http://rabbitmq.com/documentation.html to learn more
 * Consult server logs on node rabbit@redis01

DIAGNOSTICS
===========

attempted to contact: [rabbit@host1]

rabbit@host1:
  * connected to epmd (port 4369) on host1
  * epmd reports node 'rabbit' uses port 25672 for inter-node and CLI tool traffic 
  * TCP connection succeeded but Erlang distribution failed 

  * Authentication failed (rejected by the remote node), please check the Erlang cookie


Current node details:
 * node name: 'rabbitmqcli-23827-rabbit@host2'
 * effective user's home directory: /var/lib/rabbitmq
 * Erlang cookie hash: t9ttNYffM0xwbMi8k2DA4w==

Cause: the .erlang.cookie files on node1 and node2 did not match.
Fix: use node1's .erlang.cookie on every node (as described above).
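A quick check before running join_cluster can rule this error out. The sketch below is only an illustration: cookie_ok is a hypothetical helper that compares a node's cookie with a reference copy (for example, node1's cookie fetched to a temporary path) and verifies mode 400. It uses GNU `stat -c`, as found on CentOS, and the demo works on throwaway files rather than the real /var/lib/rabbitmq/.erlang.cookie.

```shell
# cookie_ok LOCAL REFERENCE: succeed if LOCAL has the same content as
# REFERENCE and has mode 400. Hypothetical helper for illustration.
cookie_ok() {
    cmp -s "$1" "$2" || { echo "cookie content differs"; return 1; }
    ck_mode=$(stat -c '%a' "$1")        # GNU stat; prints e.g. 400
    [ "$ck_mode" = "400" ] || { echo "mode is $ck_mode, expected 400"; return 1; }
    echo "cookie ok"
}

# Demo with temporary files standing in for the two nodes' cookies.
demo_dir=$(mktemp -d)
printf 'EXAMPLECOOKIE' > "$demo_dir/node1.cookie"
printf 'EXAMPLECOOKIE' > "$demo_dir/node2.cookie"
chmod 400 "$demo_dir/node1.cookie" "$demo_dir/node2.cookie"
cookie_ok "$demo_dir/node2.cookie" "$demo_dir/node1.cookie"
rm -rf "$demo_dir"
```

If the check fails, copy node1's cookie over node2's, restore mode 400, and restart rabbitmq-server on node2 before retrying the join.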
