Hive ParseException in Drop Table Statement

Posted by 一世执手 on 2020-01-05 06:55:10

Question


I'm using Python, and the pyodbc module in particular, to execute Hive queries on Hadoop. The portion of code that triggers the issue looks like this:

import pyodbc
import pandas

oConnexionString = 'Driver={ClouderaHive};[...]'
oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
oConnexion.setencoding(encoding='utf-8')
oQueryParameter = "select * from my_db.my_table;"
oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
oCursor = oConnexion.cursor()

for oRow in oParameterData.index:
    sTableName = oParameterData.loc[oRow,'TableName']
    oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
    print(oQueryDeleteTable)
    oCursor.execute(oQueryDeleteTable)

The print gives this: drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

But the call to cursor.execute then raises the following error:

pyodbc.Error: ('HY000', "[HY000] [Cloudera][HiveODBC] (80) Syntax or semantic analysis error thrown in server while execurint query. Error message from server: Error while compiling statement: FAILED: ParseException line 1:44 character ' (80) (SQLExecDirectW)")

Note that when I copy the printed statement and run it manually in Hue, it works fine. I am guessing it has something to do with the encoding of the variable sTableName, but I can't figure out how to fix it.

Thanks


Answer 1:


The query was failing because the text in the variable sTableName was decoded incorrectly. Printing the variable on its own displayed the text properly, so the problem was not visible on screen. Example with the print above:

>>> print(oQueryDeleteTable)
drop table if exists dl_audit_data_quality.hero_context_start_gamemode;

But inspecting the value in the original data frame showed that it contained extra \x00 characters like this:

>>> print(repr(oParameterData.loc[oRow,'TableName']))
'h\x00e\x00r\x00o\x00_c\x00o\x00n\x00t\x00e\x00x\x00t\x00'
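Interleaved \x00 characters of this kind are what you typically see when bytes in one encoding are decoded with another. The sketch below reproduces the pattern in plain Python, without any database. The choice of UTF-16-LE and Latin-1 is an assumption made purely for illustration; the post does not say which encodings the driver and pyodbc actually used.

```python
# Illustration only: reproduce the "h\x00e\x00..." pattern with a
# deliberate encoding mismatch. The encodings are assumed, not taken
# from the original post.
table_name = "hero_context"

# Two bytes per character: the letter followed by a zero byte.
raw_bytes = table_name.encode("utf-16-le")

# Reading those bytes one byte per character keeps every zero byte
# as a separate NUL character in the resulting string.
misdecoded = raw_bytes.decode("latin-1")
print(repr(misdecoded))

# The NULs are usually invisible when printed, but they are still
# part of the string and end up inside any query built from it.
query = "drop table if exists my_db." + misdecoded + ";"
print("\x00" in query)

# Decoding the same bytes with the matching encoding restores the name.
recovered = misdecoded.encode("latin-1").decode("utf-16-le")
print(recovered)
```

This is why the printed statement looked correct while the server rejected it: the NUL characters do not show up in the terminal, but the parser receives them.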

The issue was solved by setting the connection's decoding explicitly, as described in "Python Dictionary Contains Encoded Values":

import pyodbc
import pandas

oConnexionString = 'Driver={ClouderaHive};[...]'
oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
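# Decode text returned by the driver (SQL_CHAR and SQL_WCHAR) as UTF-8
oConnexion.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
oConnexion.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
# Encode strings sent to the driver as UTF-8
oConnexion.setencoding(encoding='utf-8')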
oQueryParameter = "select * from my_db.my_table;"
oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
oCursor = oConnexion.cursor()

for oRow in oParameterData.index:
    sTableName = oParameterData.loc[oRow,'TableName']
    oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
    print(oQueryDeleteTable)
    oCursor.execute(oQueryDeleteTable)
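Because the stray characters were invisible when printed, it may help to check each table name before concatenating it into a statement. The following is a minimal sketch, not part of the original answer: build_drop_statement is a hypothetical helper, and it assumes table and database names consist only of ASCII letters, digits, and underscores.

```python
import re

# Assumption: identifiers are plain unquoted names made of ASCII
# letters, digits, and underscores.
_IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def build_drop_statement(database, table):
    """Hypothetical helper: build a DROP TABLE statement, rejecting
    names that contain unexpected characters such as \\x00."""
    for name in (database, table):
        if not _IDENTIFIER.fullmatch(name):
            # repr() makes invisible characters show up in the message.
            raise ValueError("unexpected characters in identifier: %r" % name)
    return "drop table if exists %s.%s;" % (database, table)


print(build_drop_statement("my_db", "hero_context"))

try:
    build_drop_statement("my_db", "h\x00e\x00r\x00o\x00")
except ValueError as exc:
    print(exc)
```

With a check like this, a badly decoded name fails early with a message that shows the hidden characters, instead of surfacing later as a ParseException from the server.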


Source: https://stackoverflow.com/questions/43821098/hive-parseexception-in-drop-table-statement
