Question
As mentioned in the post "Using the Icelandic Thorn character as a delimiter in Hive", the thorn character (þ) is not recognized as a field delimiter in Hive.
Sample table
CREATE EXTERNAL TABLE IF NOT EXISTS zzzzz_raw (
spot_id INT,
activity_type_id INT,
activity_type STRING,
activity_id INT,
activity_sub_type STRING,
report_name STRING,
tag_method_id INT
)
PARTITIONED BY ( dt DATE )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\-2' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION '/raw/data/networkmatchtablesactivity/activity_cat';
Output
select * from activity_cat_raw limit 1;
4552126þ805759þeaasv101þ2275868þbfeaac01þBF_EA Access_Info Pageþ2 NULL NULL NULL NULL NULL NULL 2015-03-24
Am I missing something?
Answer 1:
I found the answer. Instead of '\-2' (the thorn delimiter), I used '\-61' as the delimiter and then applied a substring to remove the extra leading symbol left on each field, something like below:
CREATE EXTERNAL TABLE IF NOT EXISTS SSSSSS (
spot_id STRING,
activity_type_id STRING,
activity_type STRING,
activity_id STRING,
activity_sub_type STRING,
report_name STRING,
tag_method_id STRING
)
PARTITIONED BY ( dt STRING )
ROW FORMAT DELIMITED FIELDS TERMINATED BY '\-61' LINES TERMINATED BY '\n'
STORED AS TEXTFILE
LOCATION 'SSSSSS';
Then use substr to strip the leftover leading symbol from the affected fields:
INSERT OVERWRITE TABLE vvvvvv PARTITION (dt)
SELECT spot_id,
substr(activity_type_id,2),
-- apply substr(col,2) to each remaining delimited column in the same way
dt
FROM SSSSSS;
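The reason '\-61' works while '\-2' does not can be checked outside Hive. The sketch below is only an illustration, assuming the data file is UTF-8 encoded and that Hive interprets the delimiter as a single signed byte. The helper name signed and the sample row are made up for the example. In Latin-1, þ is the single byte 0xFE (-2), but in UTF-8 it is two bytes, 0xC3 0xBE (-61, -66), so splitting on -61 leaves one stray byte at the start of every field after the first.

```python
# Illustration only: assumes UTF-8 input and a single signed-byte delimiter.

def signed(b: int) -> int:
    """Interpret an unsigned byte value (0-255) as a signed 8-bit integer."""
    return b - 256 if b > 127 else b

thorn = "þ"

# Signed byte values of the thorn character in each encoding.
latin1_bytes = [signed(b) for b in thorn.encode("latin-1")]
utf8_bytes = [signed(b) for b in thorn.encode("utf-8")]

# Hypothetical sample row, shortened from the output shown in the question.
row = "4552126þ805759þeaasv101".encode("utf-8")

# Splitting on the first UTF-8 byte (0xC3, i.e. -61) leaves 0xBE
# at the start of every field except the first.
fields = row.split(bytes([0xC3]))

# Dropping that leading byte mirrors substr(col, 2) in the Hive query.
cleaned = [fields[0]] + [f[1:] for f in fields[1:]]

print(latin1_bytes, utf8_bytes)
print(fields)
print(cleaned)
```

The first field has no stray byte, which is why spot_id is selected without substr in the query above.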
Hope it helps.
Source: https://stackoverflow.com/questions/30245214/thorn-character-delimiter-is-not-recognized-in-hive