Writing columns having NULL as some string using OpenCSVSerde - HIVE

Posted by 北城以北 on 2021-02-18 17:47:15

Question


I'm using 'org.apache.hadoop.hive.serde2.OpenCSVSerde' to write Hive table data.

CREATE TABLE testtable ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "'"
)
STORED AS TEXTFILE LOCATION '<location>' AS
SELECT * FROM foo;

If the 'foo' table contains empty strings, e.g. '1','2','', they are written to the text file as is. The data in the text file reads '1','2',''

But if 'foo' contains null values, e.g. '1','2',null, the null value is not written to the text file at all. The data in the text file reads '1','2',

How do I make sure that nulls are properly written to the text file when using the CSV SerDe, either as empty strings or as some other string, say "nullstring"?

I also tried this:

CREATE TABLE testtable ROW FORMAT SERDE
....
....  
STORED AS TEXTFILE LOCATION '<location>'
TBLPROPERTIES ('serialization.null.format'='')
AS SELECT * FROM foo;

I realize this would probably just replace the empty strings with null, but it doesn't even do that.

Please guide me on how to write nulls to csv files.

Will I have to check each column for null values in the select query itself (with NVL or something similar) and replace them with something?


Answer 1:


OpenCSVSerde ignores the 'serialization.null.format' property. You can handle null values with the steps below:

1. CREATE TABLE testtable
    (
    name string,
    title string,
    birth_year string
    )
    ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
    WITH SERDEPROPERTIES (
    "separatorChar" = ",",
    "quoteChar"     = "'"
    )
    STORED AS TEXTFILE;

2. Load data into testtable.

3. CREATE TABLE testtable1
    (
    name string,
    title string,
    birth_year string
    )
    ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
    TBLPROPERTIES('serialization.null.format'='');

4. INSERT OVERWRITE TABLE testtable1 SELECT * FROM testtable;
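
The question also asks whether the nulls could be replaced in the select query itself. A minimal sketch of that alternative follows, assuming 'foo' has three columns; the names col1, col2 and col3 are hypothetical, since the question never lists the columns of 'foo'. COALESCE (or NVL) substitutes a placeholder before the row reaches the SerDe, so OpenCSVSerde only ever sees non-null strings. This has not been verified against a specific Hive version.

```sql
-- Sketch only: col1, col2, col3 are hypothetical column names for 'foo'.
-- Each column is cast to STRING (OpenCSVSerde treats every column as a
-- string anyway) and NULL is replaced with the literal 'nullstring'.
-- Use '' instead of 'nullstring' to write empty strings.
CREATE TABLE testtable
ROW FORMAT SERDE 'org.apache.hadoop.hive.serde2.OpenCSVSerde'
WITH SERDEPROPERTIES (
  "separatorChar" = ",",
  "quoteChar"     = "'"
)
STORED AS TEXTFILE LOCATION '<location>' AS
SELECT
  COALESCE(CAST(col1 AS STRING), 'nullstring') AS col1,
  COALESCE(CAST(col2 AS STRING), 'nullstring') AS col2,
  COALESCE(CAST(col3 AS STRING), 'nullstring') AS col3
FROM foo;
```

The trade-off is that every column has to be listed explicitly instead of using select *, and a reader of the file can no longer tell a real 'nullstring' value from a replaced null.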


Source: https://stackoverflow.com/questions/43058435/writing-columns-having-null-as-some-string-using-opencsvserde-hive
