Handling newline characters in Hive

离开以前 2020-12-19 21:18

I have created a table in Hive as:

Create table(id int, Description String)  

My data looks something like this:

 
1|This will         


        
3 answers
  • 2020-12-19 21:31

    I know this question is old, but you have a couple of options. You can't control this with FIELDS TERMINATED BY, because that only controls what terminates the fields, not the records. Records in Hive are hard-coded to be terminated by the newline character (even though there is a LINES TERMINATED BY clause, it is not implemented).

    1. Write a custom InputFormat that uses a RecordReader that understands non-newline delimited records. Look at the code for LineReader/LineRecordReader and TextInputFormat.
    2. Use a format other than text/ASCII, like Parquet. I would recommend this regardless, as text is probably the worst format you can store data in anyway.
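
    As a rough sketch of option 2: the table and column names below are hypothetical, and it assumes the rows are available intact from some other source (here a made-up `staging_descriptions` table). Because Parquet stores each value in a binary column rather than as a line of text, a newline inside `description` should not split the record.

    ```sql
    -- Hypothetical table names; adjust to your schema.
    -- STORED AS PARQUET requires Hive 0.13 or later.
    CREATE TABLE descriptions_parquet (
      id          INT,
      description STRING
    )
    STORED AS PARQUET;

    -- Assumes staging_descriptions already holds the rows with
    -- multi-line description values intact.
    INSERT INTO TABLE descriptions_parquet
    SELECT id, description
    FROM staging_descriptions;
    ```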
  • 2020-12-19 21:39

    Try adding the property below to hive-site.xml, or set it just for the current Hive session.

    hive.query.result.fileformat=SequenceFile
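
    For the session-level variant, a minimal sketch is to issue a `SET` statement from the Hive CLI or Beeline before running the query. The setting lasts only for that session.

    ```sql
    -- Session-level override; not persisted to hive-site.xml.
    SET hive.query.result.fileformat=SequenceFile;

    -- Running SET with only the property name prints its current value.
    SET hive.query.result.fileformat;
    ```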

  • 2020-12-19 21:45

    By default, Hive uses the newline character ('\n') as the record delimiter. You can change the field delimiter using:

        ROW FORMAT DELIMITED FIELDS TERMINATED BY ",";
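
    In context, the clause goes at the end of the table definition. The sketch below uses a hypothetical table name and assumes `|` as the field separator, since that is what the sample data in the question appears to use. Note that this only changes how fields are split; a newline inside a field still ends the record in a text table.

    ```sql
    -- Hypothetical table name; '|' matches the sample row "1|This will".
    CREATE TABLE descriptions (
      id          INT,
      description STRING
    )
    ROW FORMAT DELIMITED
      FIELDS TERMINATED BY '|'
    STORED AS TEXTFILE;
    ```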
    