we would like to put the results of a Hive query to a CSV file. I thought the command should look like this:
insert overwrite directory \'/home/output.csv\'
I was looking for a similar solution, but the ones mentioned here would not work. My data had all variations of whitespace (space, newline, tab) chars and commas.
To make the column data tsv safe, I replaced all \t chars in the column data with a space, and executed python code on the commandline to generate a csv file, as shown below:
hive -e 'tab_replaced_hql_query' | python -c 'exec("import sys;import csv;reader = csv.reader(sys.stdin, dialect=csv.excel_tab);writer = csv.writer(sys.stdout, dialect=csv.excel)\nfor row in reader: writer.writerow(row)")'
This created a perfectly valid csv. Hope this helps those who come looking for this solution.