Remove \r (CR) from CSV

自闭症患者 2020-12-11 09:17

On OSX I need to remove line-ending CR (\r) characters (represented as ^M in the output from cat -v) from my CSV file.



        
3 answers
  • 2020-12-11 09:33

    Solutions with stock utilities:

    Note: Except where noted (the sed -i incompatibility), the following solutions work on both OSX (macOS) and Linux.

    Use sed as follows, which replaces \r\n with \n:

    sed $'s/\r$//' myitems.csv
    

    To update the input file in place, use

    sed -i '' $'s/\r$//' myitems.csv
    

    -i '' specifies updating in place, with '' indicating that no backup should be made of the input file; if you specify an extension, e.g., -i'.bak', the original input file will be saved with that extension as a backup.
    Caveats:
    * With GNU sed (Linux), to not create a backup file, you'd have to use just -i, without the separate '' argument, which is an unfortunate syntactic incompatibility between GNU sed and the BSD sed used on OSX (macOS) - see this answer of mine for the full story.
    * -i creates a new file with a temporary name and then replaces the original file; the most notable consequence is that if the original file was a symlink, it is replaced with a regular file; for a detailed discussion, see the lower half of this answer.
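
One way around the -i incompatibility, offered as a sketch rather than as part of the original answer: attach a backup suffix directly to -i (no space), a form that both GNU sed and BSD sed accept, then delete the backup. The scratch directory, the file name demo.csv, and the .bak suffix are illustrative assumptions; the CR is created with printf so that no $'...' support is needed.

```shell
# Create a sample file with CRLF line endings in a scratch directory.
dir=$(mktemp -d)
printf 'a,b\r\nc,d\r\n' > "$dir/demo.csv"

# CR character, created via printf.
cr=$(printf '\r')

# -i.bak (suffix attached directly) is accepted by both GNU sed and BSD sed;
# the backup is removed only if sed succeeded.
sed -i.bak "s/${cr}\$//" "$dir/demo.csv" && rm "$dir/demo.csv.bak"

# Count the CR characters that are still in the file.
remaining=$(tr -cd '\r' < "$dir/demo.csv" | wc -c)
echo "CR characters remaining: $remaining"
```

The symlink caveat above still applies, because -i replaces the file either way.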

    Note: The above uses an ANSI C-quoted string ($'...') to create the \r character in the sed command, because BSD sed (the one used on OSX) doesn't natively recognize such escape sequences (whereas the GNU sed used on Linux distros does).
    ANSI C-quoted strings are supported in Bash, Ksh, and Zsh.

    If you don't want to rely on such strings, use:

    sed 's/'"$(printf '\r')"'$//' myitems.csv
    

    Here, the \r is created via printf and spliced into the sed command with a command substitution ($(...)).
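
If the printf splice is needed in several places, it can be wrapped in a small function. This is only a sketch: the name strip_cr is hypothetical, and the function simply passes its arguments through to sed, so it works on files or on standard input.

```shell
# strip_cr [FILE...]: write the input to stdout with a trailing CR
# removed from each line. With no arguments, sed reads standard input.
strip_cr() {
  sed 's/'"$(printf '\r')"'$//' "$@"
}

# Usage on a pipeline; the sample data is made up for illustration.
out=$(printf 'x\r\ny\r\n' | strip_cr)
printf '%s\n' "$out" | cat -v
```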


    Using perl:

    perl -pe 's/\r\n/\n/' myitems.csv | cat -v
    

    To update the input file in place, use

    perl -i -pe 's/\r\n/\n/' myitems.csv  # -i'.bak' creates backup with suffix '.bak' first
    

    The same caveat as above for sed with regard to in-place updating applies.


    Using awk:

    awk '{ sub("\r$", ""); print }' myitems.csv  # shorter: awk 'sub("\r$", "")+1'
    

    BSD awk offers no in-place updating option, so you'll have to capture the output in a different file; to use a temporary file and have it replace the original afterward, use the following idiom:

    awk '{ sub("\r$", ""); print }' myitems.csv > tmpfile && mv tmpfile myitems.csv
    

    GNU awk v4.1 or higher offers -i inplace for in-place updating, to which the same caveat as above for sed applies.
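
The temporary-file idiom can be made a little more robust by letting mktemp pick the temporary name and by cleaning up if awk fails. A minimal sketch, assuming a hypothetical function name strip_cr_inplace and a throwaway sample file:

```shell
# strip_cr_inplace FILE: remove a trailing CR from every line of FILE.
strip_cr_inplace() {
  file=$1
  # Create the temp file next to the target so that mv stays on one filesystem.
  tmp=$(mktemp "$(dirname "$file")/tmp.XXXXXX") || return 1
  if awk '{ sub("\r$", ""); print }' "$file" > "$tmp"; then
    mv "$tmp" "$file"
  else
    # Leave the original untouched and discard the partial output.
    rm -f "$tmp"
    return 1
  fi
}

# Usage with a sample file in a scratch directory.
dir=$(mktemp -d)
printf 'a,b\r\nc,d\r\n' > "$dir/demo.csv"
strip_cr_inplace "$dir/demo.csv"
cat -v "$dir/demo.csv"
```

As with the plain idiom, the result is a new file: a symlink is replaced with a regular file, and the file ends up with the permissions of the temporary file rather than those of the original.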


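    Edge case for the sed and awk variants above: If the very last character in the input file happens to be a lone \r without a following \n, it is removed as well, and awk and BSD sed end the output with a \n in its place; the perl variants match only \r\n and leave it untouched.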


    For the sake of completeness: here are additional, possibly suboptimal solutions:

    None of them offers in-place updating, but you can employ the > tmpfile && mv tmpfile myitems.csv idiom introduced above.


    Using tr: a very simple solution that removes all \r instances; thus, it can only be used if \r instances occur only as part of \r\n sequences; typically, however, that is the case:

    tr -d '\r' < myitems.csv
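
Whether a file meets that precondition can be checked before running tr. The following is a sketch under stated assumptions: the function name only_crlf and the sample files are hypothetical, and a CR at the very end of a file without a final \n is counted as a line ending.

```shell
# only_crlf FILE: succeed if every CR in FILE is the last character of a line.
only_crlf() {
  cr=$(printf '\r')
  # Total number of CR characters in the file.
  total=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
  # Number of lines that end in CR (grep -c exits non-zero when the count is 0).
  at_eol=$(grep -c "${cr}\$" "$1" || true)
  [ "$total" -eq "$at_eol" ]
}

dir=$(mktemp -d)
printf 'a,b\r\nc,d\r\n' > "$dir/good.csv"   # CRs only as part of CRLF
printf 'a\rb\r\nc,d\r\n' > "$dir/bad.csv"   # one CR in the middle of a line

for f in "$dir/good.csv" "$dir/bad.csv"; do
  if only_crlf "$f"; then
    echo "$(basename "$f"): safe for tr -d"
  else
    echo "$(basename "$f"): contains CRs outside line endings"
  fi
done
```

The check works because each line can end in at most one CR, so the two counts match only when no CR appears anywhere else.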
    

    Using pure bash code: note that this will be slow; like the tr solution, this can only be used if \r instances occur only as part of \r\n sequences.

    while IFS=$'\r' read -r line; do
      printf '%s\n' "$line"
    done < myitems.csv
    

    $IFS is the internal field separator, and setting it to \r causes read to read everything before \r, if present, into variable $line (if there's no \r, the line is read as is). -r prevents read from interpreting \ instances in the input.

    Edge case: If the input doesn't end with \n, the last line will not print - you could fix that by using read -r line || [[ -n $line ]].
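
Put together, the loop with that fix might look like the sketch below. It deviates from the snippet above in two labeled ways: the CR is produced with printf rather than $'\r', and the test uses [ -n "$line" ], which Bash accepts as well as [[ -n $line ]]. The function name strip_cr_loop and the sample input are hypothetical.

```shell
cr=$(printf '\r')

# strip_cr_loop: read stdin line by line, dropping a trailing CR from each line.
strip_cr_loop() {
  # read fails on a final line that lacks a newline but still fills $line;
  # the || test lets the loop body run once more for that line.
  while IFS=$cr read -r line || [ -n "$line" ]; do
    printf '%s\n' "$line"
  done
}

# The second line of the sample input has no trailing newline.
result=$(printf 'a,b\r\nc,d' | strip_cr_loop)
printf '%s\n' "$result" | cat -v
```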

  • 2020-12-11 09:35

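    Try the dos2unix command; its counterpart unix2dos converts in the opposite direction (it adds CRs, as in unix2dos infile outfile).

    http://en.wikipedia.org/wiki/Unix2dos

    The Wikipedia page has some examples using perl and sed too; note that these two add CRs (Unix to DOS) rather than remove them:

    perl -i -p -e 's/\n/\r\n/' file
    sed -i -e 's/$/\r/' file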
    
  • 2020-12-11 09:49

    Try this; it will fix your issue.

    dos2unix myitems.csv myitems.csv
    