How to Access Hive via Python?

小蘑菇 2020-11-30 17:11

https://cwiki.apache.org/confluence/display/Hive/HiveClient#HiveClient-Python appears to be outdated.

When I add this to /etc/profile:

export PYTHONP         


        
16 answers
  •  抹茶落季
    2020-11-30 17:35

    The examples linked above are somewhat out of date. Here is a newer example using pyhs2:

    import getpass

    import pyhs2 as hive

    DEFAULT_DB = 'default'
    DEFAULT_SERVER = '10.37.40.1'
    DEFAULT_PORT = 10000
    DEFAULT_DOMAIN = 'PAM01-PRD01.IBM.COM'

    # Python 2 syntax; on Python 3 use input() instead of raw_input().
    u = raw_input('Enter PAM username: ')
    s = getpass.getpass()  # prompts for the password without echoing it

    # Connect to HiveServer2 with LDAP authentication.
    connection = hive.connect(host=DEFAULT_SERVER, port=DEFAULT_PORT,
                              authMechanism='LDAP',
                              user=u + '@' + DEFAULT_DOMAIN, password=s,
                              database=DEFAULT_DB)
    statement = "select * from user_yuti.Temp_CredCard where pir_post_dt = '2014-05-01' limit 100"
    cur = connection.cursor()

    cur.execute(statement)
    df = cur.fetchall()  # a list of result rows, not a DataFrame
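
    The execute/fetchall pattern above returns bare rows without column names. Below is a minimal sketch of pairing rows with column names, assuming the driver's cursor exposes a DB-API style `description` attribute (pyhs2 offers `getSchema()` instead, so this may need adapting there). `fetch_as_dicts` is a hypothetical helper, and `sqlite3` with made-up data stands in for the Hive connection only so the sketch runs without a cluster.

```python
import sqlite3  # stand-in for a Hive connection; an assumption for this demo only


def fetch_as_dicts(connection, statement):
    """Run a query on a DB-API connection and return rows as dictionaries."""
    cur = connection.cursor()
    try:
        cur.execute(statement)
        # cursor.description holds one tuple per column; item 0 is the column name.
        columns = [col[0] for col in cur.description]
        return [dict(zip(columns, row)) for row in cur.fetchall()]
    finally:
        cur.close()


# Demo with an in-memory SQLite table (hypothetical data, not from Hive).
demo = sqlite3.connect(":memory:")
demo.execute("CREATE TABLE temp_credcard (id INTEGER, pir_post_dt TEXT)")
demo.execute("INSERT INTO temp_credcard VALUES (1, '2014-05-01')")
demo.execute("INSERT INTO temp_credcard VALUES (2, '2014-05-02')")
rows = fetch_as_dicts(
    demo, "SELECT * FROM temp_credcard WHERE pir_post_dt = '2014-05-01' LIMIT 100"
)
print(rows)
```

    With a real Hive connection, the same helper would be called with that connection object in place of `demo`.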
    

    In addition to a standard Python installation, a few libraries need to be installed before Python can connect to the Hadoop database:

    1. pyhs2, Python Hive Server 2 client driver

    2. sasl, Cyrus-SASL bindings for Python

    3. thrift, Python bindings for the Apache Thrift RPC system

    4. PyHive, Python interface to Hive
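
    A quick way to see which of the four packages listed above are still missing is to ask the import system directly. This is a sketch for Python 3 (on Python 2, wrapping each import in try/except ImportError would be the rough equivalent); `missing_modules` is a hypothetical helper name, and the list uses the import names of the packages.

```python
import importlib.util

# Import names for the four packages listed above.
REQUIRED_MODULES = ["pyhs2", "sasl", "thrift", "pyhive"]


def missing_modules(names):
    """Return the names that cannot be found on the current import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules(REQUIRED_MODULES)
if missing:
    print("Not installed: " + ", ".join(missing))
else:
    print("All Hive client libraries are importable.")
```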

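    Remember to make the script executable before running it (running it as ./test_hive2.py also requires a shebang line such as #!/usr/bin/env python at the top of the file):

    chmod +x test_hive2.py
    ./test_hive2.py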
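
    If the permission change has to happen from Python rather than from the shell, something along these lines should work on POSIX systems. It is only a sketch: `make_executable` is a hypothetical helper, it behaves roughly like `chmod a+x`, and the demo writes a throwaway placeholder file instead of the real test_hive2.py.

```python
import os
import stat
import tempfile


def make_executable(path):
    """Add execute permission for user, group and others (roughly chmod a+x)."""
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)


# Demo on a throwaway file named after the script mentioned above.
script_path = os.path.join(tempfile.mkdtemp(), "test_hive2.py")
with open(script_path, "w") as handle:
    # The shebang line lets the shell run the file directly as ./test_hive2.py
    handle.write("#!/usr/bin/env python\nprint('placeholder script')\n")
make_executable(script_path)
print(os.access(script_path, os.X_OK))
```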

    Hope this helps. Reference: https://sites.google.com/site/tingyusz/home/blogs/hiveinpython
