Bash: Parse CSV with quotes, commas and newlines

隐瞒了意图╮ 2020-12-11 16:10

Say I have the following csv file:

 id,message,time
 123,"Sorry, This message
 has commas and newlines",2016-03-28T20:26:39
 456,"It makes the problem non


        
7 answers
  •  情歌与酒
    2020-12-11 16:39

    As said here, the full command is:

    gawk -v RS='"' 'NR % 2 == 0 { gsub(/\n/, "") } { printf("%s%s", $0, RT) }' file.csv \
     | awk -F, '{print $NF}'
    

    To remove only the newlines inside double-quoted strings and leave those outside them alone, use GNU awk (needed for RT):

    gawk -v RS='"' 'NR % 2 == 0 { gsub(/\n/, "") } { printf("%s%s", $0, RT) }' file
    

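    This works by splitting the file into records at each " character (RS='"'). Even-numbered records are the text inside quotes, so newlines are removed only there. RT holds the " that ended each record and is printed back, so the quotes themselves are preserved.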

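    Then use awk to split the columns on commas and display the last column ($NF). Taking $NF works because the time column is last and contains no commas; the commas inside the quoted message only shift the fields before it.

    Output:

    time
    2016-03-28T20:26:39
    2016-03-28T20:26:41
    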
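
If GNU awk is not available, the same two steps can be approximated with POSIX awk features only. This is a minimal sketch, not the method from the answer: instead of RT it keeps a running count of double quotes and drops the newline whenever the count so far is odd, which means the line ended inside a quoted field. The function name join_quoted_newlines and the second sample row are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not from the answer): join physical lines that fall
# inside a double-quoted field, using only POSIX awk features (no RT).
join_quoted_newlines() {
  awk '
    {
      line = $0
      # gsub returns the number of matches, so this counts the quotes on the line
      quotes += gsub(/"/, "&", line)
      if (quotes % 2 == 1) {
        printf "%s", $0    # odd count so far: still inside a quoted field
      } else {
        print              # even count: the CSV record is complete
      }
    }
  ' "$@"
}

# Made-up sample data; the second data row is invented for illustration.
sample='id,message,time
123,"Sorry, This message
has commas and newlines",2016-03-28T20:26:39
456,"A second
multi-line message",2016-03-28T20:26:41'

# Step 1: join the broken lines. Step 2: print the last comma-separated field.
printf '%s\n' "$sample" | join_quoted_newlines | awk -F, '{ print $NF }'
```

On this sample the pipeline prints time, 2016-03-28T20:26:39 and 2016-03-28T20:26:41, one per line. Like the gawk version, it deletes the newline without inserting a space, and it assumes quotes inside a field are escaped by doubling them, which keeps the count even.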
