Any command to get the active namenode for a nameservice in Hadoop?


The command:

hdfs haadmin -getServiceState machine-98

works only if you know the machine name. Is there any command that takes the nameservice and reports which namenode is active, like:

h
10 answers
  • 2020-12-13 22:21

    I found the commands below by simply typing 'hdfs'; a couple of them could be useful for anyone who comes here looking for help.

    hdfs getconf -namenodes
    

    This command prints the namenode host(s), e.g. hn1.hadoop.com

    hdfs getconf -secondaryNameNodes
    

    This command prints the secondary namenode host(s), if configured, e.g. hn2.hadoop.com

    hdfs getconf -backupNodes
    

    This command prints the backup node host(s), if any.

    hdfs getconf -nnRpcAddresses
    

    This command prints the namenode RPC addresses as host:port, e.g. hn1.hadoop.com:8020
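
    Note that these commands only enumerate the nodes. To see which one is active, you can pass each namenode ID from hdfs getconf -confKey dfs.ha.namenodes.<nameservice> to hdfs haadmin -getServiceState, which prints "active" or "standby".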

    You're welcome :)
  • 2020-12-13 22:28

    In a High Availability Hadoop cluster there are two namenodes: one active and one standby.

    To find the active namenode, we can run a test hdfs command against each namenode; the run that succeeds identifies the active one.

    The command below succeeds if the namenode is active and fails if it is a standby:

    hadoop fs -test -e hdfs://<Name node>/
    

    Unix script

    active_node=''
    if hadoop fs -test -e hdfs://<NameNode-1>/ ; then
        active_node='<NameNode-1>'
    elif hadoop fs -test -e hdfs://<NameNode-2>/ ; then
        active_node='<NameNode-2>'
    fi

    echo "Active Dev Name node : $active_node"
    
  • 2020-12-13 22:31

    From the Java API, you can use HAUtil.getAddressOfActive(fileSystem).
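
    A minimal sketch of how that might look, assuming the client configuration on the classpath points fs.defaultFS at the HA nameservice (note HAUtil lives in hadoop-hdfs and is a private-audience API, so treat this as illustrative rather than guaranteed stable):

    import java.net.InetSocketAddress;

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.hdfs.HAUtil;

    public class ActiveNameNodeFinder {
        public static void main(String[] args) throws Exception {
            // Picks up core-site.xml / hdfs-site.xml from the classpath;
            // fs.defaultFS should point at the HA nameservice.
            Configuration conf = new Configuration();
            FileSystem fs = FileSystem.get(conf);

            // Resolves the address of the currently active namenode.
            InetSocketAddress active = HAUtil.getAddressOfActive(fs);
            System.out.println("Active namenode: "
                    + active.getHostString() + ":" + active.getPort());
        }
    }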

  • 2020-12-13 22:34

    If the cluster is managed by Ambari, you can use a curl command against its REST API to find the active and standby namenodes, for example:

    curl -u username -H "X-Requested-By: ambari" -X GET http://cluster-hostname:8080/api/v1/clusters/<cluster-name>/services/HDFS

    The response is JSON describing the HDFS service; parse it to find each namenode's HA state.

  • 2020-12-13 22:37

    Found this:

    https://gist.github.com/cnauroth/7ff52e9f80e7d856ddb3

    This works out of the box on my CDH5 namenodes, although I'm not sure whether other Hadoop distributions expose http://namenode:50070/jmx - if not, I think it can be added by deploying Jolokia.

    Example:

    curl 'http://namenode1.example.com:50070/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus'
    {
      "beans" : [ {
        "name" : "Hadoop:service=NameNode,name=NameNodeStatus",
        "modelerType" : "org.apache.hadoop.hdfs.server.namenode.NameNode",
        "State" : "active",
        "NNRole" : "NameNode",
        "HostAndPort" : "namenode1.example.com:8020",
        "SecurityEnabled" : true,
        "LastHATransitionTime" : 1436283324548
      } ]
    }

    So by firing one HTTP request at each namenode (which should be quick), we can figure out which one is active.

    It's also worth noting that if you make a WebHDFS REST API call to a standby namenode, you will get a 403 Forbidden and the following JSON:

    {"RemoteException":{"exception":"StandbyException","javaClassName":"org.apache.hadoop.ipc.StandbyException","message":"Operation category READ is not supported in state standby"}}
    
  • 2020-12-13 22:41

    None of the existing answers seemed to combine the three steps of:

    1. Identifying the namenodes in the cluster.
    2. Resolving each node name to host:port.
    3. Checking each node's status (without requiring cluster admin privileges).

    The solution below combines hdfs getconf calls with a JMX service call for node status.

    #!/usr/bin/env python3

    from subprocess import check_output
    from urllib.request import urlopen
    import json, sys

    def get_name_nodes(clusterName):
        # dfs.ha.namenodes.<nameservice> lists the namenode IDs, e.g. "nn1,nn2"
        ha_ns_nodes = check_output(['hdfs', 'getconf', '-confKey',
            'dfs.ha.namenodes.' + clusterName]).decode()
        nodes = ha_ns_nodes.strip().split(',')
        return [get_node_hostport(clusterName, n) for n in nodes]

    def get_node_hostport(clusterName, nodename):
        # dfs.namenode.rpc-address.<nameservice>.<nn-id> resolves an ID to host:port
        hostPort = check_output(
            ['hdfs', 'getconf', '-confKey',
             'dfs.namenode.rpc-address.{0}.{1}'.format(clusterName, nodename)]).decode()
        return hostPort.strip()

    def is_node_active(nn):
        # Query the NameNodeStatus MBean on the namenode's HTTP port
        # (not the RPC port from the address above)
        jmxPort = 50070
        host, port = nn.split(':')
        url = "http://{0}:{1}/jmx?qry=Hadoop:service=NameNode,name=NameNodeStatus".format(
                host, jmxPort)
        parsed = json.load(urlopen(url))
        return parsed.get('beans', [{}])[0].get('State', '') == 'active'

    def get_active_namenode(clusterName):
        for n in get_name_nodes(clusterName):
            if is_node_active(n):
                return n

    clusterName = sys.argv[1] if len(sys.argv) > 1 else None
    if not clusterName:
        raise Exception("Specify cluster name.")

    print('Cluster: {0}'.format(clusterName))
    print('Nodes: {0}'.format(get_name_nodes(clusterName)))
    print('Active Name Node: {0}'.format(get_active_namenode(clusterName)))
    