How do I output the results of a HiveQL query to CSV?

独厮守ぢ 2020-11-27 10:11

We would like to write the results of a Hive query to a CSV file. I thought the command should look like this:

insert overwrite directory '/home/output.csv'


        
18 answers
  •  执笔经年
    2020-11-27 10:44

    I was looking for a similar solution, but the ones mentioned here would not work. My data had all variations of whitespace (space, newline, tab) chars and commas.

    To make the column data TSV-safe, I replaced all \t characters in the column data with a space, then ran Python code on the command line to convert the tab-separated output to a CSV file, as shown below:

    hive -e 'tab_replaced_hql_query' |  python -c 'exec("import sys;import csv;reader = csv.reader(sys.stdin, dialect=csv.excel_tab);writer = csv.writer(sys.stdout, dialect=csv.excel)\nfor row in reader: writer.writerow(row)")'
    

    This created a perfectly valid csv. Hope this helps those who come looking for this solution.
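
The one-liner above can be hard to read, so here is a minimal sketch of the same conversion as a small script. It assumes, as the answer does, that the query output is tab-separated and that tabs inside column values have already been replaced. The function name `tsv_to_csv` and the sample rows are hypothetical, chosen only for illustration.

```python
import csv
import io


def tsv_to_csv(src, dst):
    """Copy tab-separated rows from src to dst as comma-separated rows."""
    reader = csv.reader(src, dialect=csv.excel_tab)
    writer = csv.writer(dst, dialect=csv.excel)
    for row in reader:
        # The excel dialect quotes any field containing a comma or a quote.
        writer.writerow(row)


# Hypothetical sample standing in for tab-separated query output.
sample = io.StringIO("id\tname\tnote\n1\tSmith, J\thas a comma\n")
out = io.StringIO()
tsv_to_csv(sample, out)
print(out.getvalue(), end="")
```

In the sample, the value `Smith, J` is written as `"Smith, J"`, so the embedded comma does not split the column. To use the function in a pipeline like the one above, call `tsv_to_csv(sys.stdin, sys.stdout)` (after `import sys`) and pipe the `hive -e` output into the script.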
