问题
I'm using python and pyodbc module in particular to execute Hive queries on Hadoop. The portion of code triggering issue is like this:
import pyodbc
import pandas
oConnexionString = 'Driver={ClouderaHive};[...]'
oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
oConnexion.setencoding(encoding='utf-8')
oQueryParameter = "select * from my_db.my_table;"
oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
oCursor = oConnexion.cursor()
for oRow in oParameterData.index:
sTableName = oParameterData.loc[oRow,'TableName']
oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
print(oQueryDeleteTable)
oCursor.execute(oQueryDeleteTable)
The print gives this: drop table if exists dl_audit_data_quality.hero_context_start_gamemode;
But the cursor.execute
triggers the following error message
pyodbc.Error: ('HY000', "[HY000] [Cloudera][HiveODBC] (80) Syntax or semantic analysis error thrown in server while execurint query. Error message from server: Error while compiling statement: FAILED: ParseException line 1:44 character ' (80) (SQLExecDirectW)")
Note that when I copy the print and execute it manually in Hue, it works well. I am guessing it has something to do with the encoding of the variable sTableName
but I can't figure out how to fix it.
Thanks
回答1:
The query was failing due to incorrect encoding of the variable sTableName
.
Printing the variable alone would display the text properly. Example with the print above:
>>> print(oQueryDeleteTable)
>>> 'drop table if exists dl_audit_data_quality.hero_context_start_gamemode;'
But printing the original data frame showed it contained characters like this:
>>> print(oParameterData.loc[oRow,'TableName']
>>> 'h\x00e\x00r\x00o\x00_c\x00o\x00n\x00t\x00e\x00x\x00t\x00'
Issue was solved by reworking on the encoding as described here: Python Dictionary Contains Encoded Values
import pyodbc
import pandas
oConnexionString = 'Driver={ClouderaHive};[...]'
oConnexion = pyodbc.connect(oConnexionString, autocommit=True)
oConnexion.setdecoding(pyodbc.SQL_CHAR, encoding='utf-8')
oConnexion.setdecoding(pyodbc.SQL_WCHAR, encoding='utf-8')
oConnexion.setencoding(encoding='utf-8')
oQueryParameter = "select * from my_db.my_table;"
oParameterData = pandas.read_sql(oQueryParameter, oConnexion)
oCursor = oConnexion.cursor()
for oRow in oParameterData.index:
sTableName = oParameterData.loc[oRow,'TableName']
oQueryDeleteTable = 'drop table if exists my_db.' + sTableName + ';'
print(oQueryDeleteTable)
oCursor.execute(oQueryDeleteTable)
来源:https://stackoverflow.com/questions/43821098/hive-parseexception-in-drop-table-statement