Handling newline characters in Hive

离开以前 2020-12-19 21:18

I have created a table in Hive as follows:

Create table(id int, Description String)  

My data looks something like this:

 
1|This will         


        
3 Answers
  •  清酒与你
    2020-12-19 21:31

    I know this question is old, but you have a couple of options. You can't control this with FIELDS TERMINATED BY, because that clause only sets what terminates the fields, not the records. Hive's text format always treats the newline character as the end of a record: a LINES TERMINATED BY clause exists in the syntax, but it accepts only '\n', so it cannot be used to pick a different record terminator.

    1. Write a custom InputFormat whose RecordReader understands records that are not newline-delimited. The code for LineReader/LineRecordReader and TextInputFormat is a good starting point.
    2. Use a format other than plain text, such as Parquet. I would recommend this regardless, since text is probably the worst format you can store data in anyway.
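
To illustrate option 1, here is a minimal Python sketch of the splitting logic such a RecordReader would need. It is not Hive or Hadoop code: a real implementation would be a Java RecordReader plugged into a custom InputFormat. The `\x1e` record delimiter, the sample data, and the names `read_records` and `parse_row` are all assumptions made up for this example.

```python
import io
from typing import Iterator, TextIO, Tuple


def read_records(stream: TextIO, delimiter: str, chunk_size: int = 4096) -> Iterator[str]:
    """Yield records separated by `delimiter`, leaving embedded newlines intact."""
    if not delimiter:
        raise ValueError("delimiter must be non-empty")
    buffer = ""
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        buffer += chunk
        # Emit every complete record currently held in the buffer.
        # A delimiter split across two chunks stays buffered until
        # the next read completes it.
        while True:
            index = buffer.find(delimiter)
            if index == -1:
                break
            yield buffer[:index]
            buffer = buffer[index + len(delimiter):]
    # Anything left over is a final record with no trailing delimiter.
    if buffer:
        yield buffer


def parse_row(record: str) -> Tuple[int, str]:
    """Split one record into (id, description) on the first '|' only."""
    raw_id, description = record.split("|", 1)
    return int(raw_id), description


# Hypothetical sample: two records separated by "\x1e". The first
# description contains newlines that must not end the record.
SAMPLE = "1|first line\nsecond line\n\nfourth line\x1e2|single line\x1e"

rows = [parse_row(r) for r in read_records(io.StringIO(SAMPLE), "\x1e", chunk_size=8)]
```

The point of the sketch is that the record boundary is decided by the reader, not by the field delimiter: once records end on something other than `\n`, a multi-line description stays in a single row.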
