Empty String is not treated as null in Hive

江枫思渺然 提交于 2019-12-03 20:29:46

问题


My understanding of the following statement is that if blank or empty string is inserted into hive column, it will be treated as null.

TBLPROPERTIES('serialization.null.format'=''

To test the functionality i have created a table and insertted '' to the filed 3. When i query for nulls on the field3, there are no rows with that criteria.

Is my understanding of making blank string to null correct??

CREATE TABLE CDR
(
field1                 string,
field2                 string,
field3                 string
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LINES TERMINATED BY '\n'
**TBLPROPERTIES('serialization.null.format'='');**

insert overwrite table emmtest.cdr select **field1,field2,''** from emmtest.cdr_non_orc;

select * from emmtest.cdr where **field3 is null;**

The last statement has not returned any rows. But i am expecting all rows to be returned since there is blank string in field3.


回答1:


TBLPROPERTIES('serialization.null.format'='') means the following:

  • An empty field in the data files will be treated as NULL when you query the table
  • When inserting rows to the table, NULL values will be written to the data files as empty fields

You are doing something else -
You are inserting an empty string to a table from a query.
It is treated "as is" - an empty string.

Demo

bash

hdfs dfs -mkdir /user/hive/warehouse/mytable
echo Hello,,World | hdfs dfs -put - /user/hive/warehouse/mytable/data.txt

hive

create table mytable (s1 string,s2 string,s3 string) 
row format delimited 
fields terminated by ','
;

hive> select * from mytable;
OK
s1  s2  s3
Hello       World

hive> alter table mytable set tblproperties ('serialization.null.format'='');
OK

hive> select * from mytable;
OK
s1  s2  s3
Hello   NULL    World


来源:https://stackoverflow.com/questions/43263038/empty-string-is-not-treated-as-null-in-hive

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!