Access tables from Impala through Python

Submitted by 不羁的心 on 2019-12-11 04:05:17

Question


I need to access Impala tables from the command line using Python, running on the same Cloudera server.

I tried the code below to establish the connection:

from impala.dbapi import connect  # impyla's DB-API entry point


def query_impala(sql):
    # Run the query and return the rows together with the column names.
    cursor = query_impala_cursor(sql)
    result = cursor.fetchall()
    field_names = [f[0] for f in cursor.description]
    return result, field_names


def query_impala_cursor(sql, params=None):
    conn = connect(host='xx.xx.xx.xx', port=21050, database='am_playbook', user='xxxxxxxx', password='xxxxxxxx')
    cursor = conn.cursor()
    # Pass the SQL as a str; encoding it to bytes is not needed on Python 3.
    cursor.execute(sql, params)
    return cursor

However, since I am on the same Cloudera server, I should not need to provide the host name. Could you please provide the correct code to access Impala/Hive tables that exist on the same server through Python?


Answer 1:


You can use PyHive to connect to Hive and access your Hive tables.

from pyhive import hive
import pandas as pd
import datetime

# Open a single connection; session settings go in the configuration dict.
conn = hive.Connection(host="hostname", port=10000, username="XXXX",
                       configuration={'hive.execution.engine': 'tez'})
query = "select col1,col2,col3,col4 from db.yourhiveTable"

start_time = datetime.datetime.now()

# Load the query result into a pandas DataFrame.
data = pd.read_sql(query, conn)
print(data)

end_time = datetime.datetime.now()
elapsed_minutes = (end_time - start_time).total_seconds() / 60.0
print('Finished reading from Hive table in', elapsed_minutes, 'minutes')
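
On the "same server" part of the question: a Python client still talks to Impala over a socket, so a host value is normally still passed, typically 'localhost'. The following is only a sketch of the question's fetch-rows-plus-column-names pattern, written against the generic DB-API 2.0 interface. The helper name fetch_with_columns is hypothetical, and an in-memory sqlite3 database stands in for Impala so the snippet runs anywhere. The commented-out impyla call assumes an Impala daemon listening locally on port 21050 without extra authentication, which may not match your cluster.

```python
import sqlite3


def fetch_with_columns(conn, sql, params=None):
    """Run a query on a DB-API 2.0 connection; return (rows, column_names)."""
    cursor = conn.cursor()
    try:
        if params is None:
            cursor.execute(sql)
        else:
            cursor.execute(sql, params)
        rows = cursor.fetchall()
        # The first item of each description entry is the column name.
        names = [col[0] for col in cursor.description]
    finally:
        cursor.close()
    return rows, names


# Stand-in connection so the sketch is runnable without a cluster.
# On the Cloudera node this would instead be something like (assumption:
# Impala daemon reachable on localhost:21050):
#   from impala.dbapi import connect
#   conn = connect(host='localhost', port=21050, database='am_playbook')
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1 INTEGER, col2 TEXT)")
conn.execute("INSERT INTO t VALUES (1, 'a'), (2, 'b')")

rows, names = fetch_with_columns(conn, "SELECT col1, col2 FROM t ORDER BY col1")
print(names)
print(rows)
```

Because the helper only relies on cursor(), execute(), fetchall() and description, the same function should work unchanged with a connection from impyla or PyHive. Secured clusters (for example Kerberos or LDAP) usually need additional connection arguments.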


Source: https://stackoverflow.com/questions/57157942/access-tables-from-impala-through-python
