Export table enclosing values with quotes to local csv in hive

我与影子孤独终老i 提交于 2020-01-01 07:09:02

问题


I am trying to export a table to a local csv file in hive.

INSERT OVERWRITE LOCAL DIRECTORY '/home/sofia/temp.csv' 
ROW FORMAT DELIMITED 
FIELDS TERMINATED BY ',' 
ESCAPED BY '\\' 
LINES TERMINATED BY '\n'
select * from mytable;

The problem is that some of the values contain the newline "\n" character and the resulting file becomes really messy.

Is there any way of enclosing the values in quotes when exporting in Hive, so that the csv file can contain special characters (and especially the newline)?


回答1:


One possible solution could be to use Hive CSV SerDe (Serializer/Deserializer). It provides a way to specify custom delimiters, quote, and escape characters.

Limitation:

It does not handle embedded newlines

Availability:

The CSV Serde is available in Hive 0.14 and greater.

Background:

The CSV SerDe is based from https://github.com/ogrodnek/csv-serde, and was added to the Hive distribution in HIVE-7777.

Usage:

This SerDe works for most CSV data, but does not handle embedded newlines. To use the SerDe, specify the fully qualified class name org.apache.hadoop.hive.serde2.OpenCSVSerde.

original documentation is available at https://github.com/ogrodnek/csv-serde.

CREATE TABLE my_table(a string, b string, ...)
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
   "separatorChar" = "\t",
   "quoteChar"     = "'",
   "escapeChar"    = "\\"
)  
STORED AS TEXTFILE;

Default separator, quote, and escape characters if unspecified

DEFAULT_ESCAPE_CHARACTER \
DEFAULT_QUOTE_CHARACTER  "
DEFAULT_SEPARATOR        ,

Reference: Hive csv-serde



来源:https://stackoverflow.com/questions/33254183/export-table-enclosing-values-with-quotes-to-local-csv-in-hive

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!