How to handle quote-enclosed CSV fields when importing data from S3 into DynamoDB using EMR/Hive

梦毁少年i 2020-12-28 17:14

I am trying to use EMR/Hive to import data from S3 into DynamoDB. My CSV file has fields that are enclosed in double quotes and separated by commas. While creating exter

7 answers
  •  离开以前
    2020-12-28 18:00

    I was stuck on the same issue: my fields are enclosed in double quotes and separated by semicolons (;). My table name is employee1.

    After some searching, I found a solution that works for this.

    We need to use a serde. Download the serde jar from this link: https://github.com/downloads/IllyaYalovyy/csv-serde/csv-serde-0.9.1.jar

    Then run the following steps at the Hive prompt:

    add jar path/to/csv-serde.jar;
    
    create table employee1(id string, name string, addr string)
    row format serde 'com.bizo.hive.serde.csv.CSVSerde'
    with serdeproperties(
    "separatorChar" = "\;",
    "quoteChar" = "\"")
    stored as textfile
    ;
    

    Then load the data from your path with the query below:

    load data local inpath 'path/xyz.csv' into table employee1;
    

    Then run:

    select * from employee1;
    

    The query should now return each field as its own column, with the enclosing quotes removed. Thanks.
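
    For the S3-to-DynamoDB case in the question, the same serde can presumably be attached to an external table that points at the S3 location, and the rows then copied into a DynamoDB-backed table. The following is only a sketch under assumptions: the bucket path, the table names employee1_s3 and employee1_ddb, the DynamoDB table name Employee, and the attribute names in the column mapping are all hypothetical, and the comma separator follows the question rather than my semicolon file. The storage handler class and table properties are the ones EMR documents for its Hive/DynamoDB integration.

    ```sql
    -- Register the serde jar (placeholder path, as in the steps above).
    add jar path/to/csv-serde.jar;

    -- External table over the quoted CSV files in S3.
    -- The bucket and prefix are hypothetical; this serde reads every column as a string.
    create external table employee1_s3(id string, name string, addr string)
    row format serde 'com.bizo.hive.serde.csv.CSVSerde'
    with serdeproperties(
    "separatorChar" = ",",
    "quoteChar" = "\"")
    stored as textfile
    location 's3://your-bucket/path/to/csv/';

    -- Hive table backed by an existing DynamoDB table.
    -- "Employee" and the attribute names Id, Name, Addr are assumptions.
    create external table employee1_ddb(id string, name string, addr string)
    stored by 'org.apache.hadoop.hive.dynamodb.DynamoDBStorageHandler'
    tblproperties(
    "dynamodb.table.name" = "Employee",
    "dynamodb.column.mapping" = "id:Id,name:Name,addr:Addr");

    -- Copy the parsed rows from S3 into DynamoDB.
    insert overwrite table employee1_ddb select * from employee1_s3;
    ```

    Running select * from employee1_s3 limit 10; before the insert is a cheap way to check that the quotes are being stripped as expected.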
