How to determine the line ending of a file

忘掉有多难 2020-12-23 09:32

I have a bunch (hundreds) of files that are supposed to have Unix line endings. I strongly suspect that some of them have Windows line endings, and I want to programmaticall

7 Answers
  • 2020-12-23 09:52

    Here's the most failsafe answer. Stimms' answer doesn't account for subdirectories and binary files.

    find . -type f -exec file {} \; | grep "CRLF" | awk -F ':' '{ print $1 }'
    
    • Use file to identify each file's type. Files reported with CRLF have Windows line terminators. Each line of file's output is delimited by a colon, and the first field is the path of the file, which awk prints.
  • 2020-12-23 09:53

    You can use the file tool, which will tell you the type of line ending. Or, you could just use dos2unix -U which will convert everything to Unix line endings, regardless of what it started with.

  • 2020-12-23 09:54

    Unix uses one byte, 0x0A (Line Feed), while Windows uses two bytes, 0x0D 0x0A (Carriage Return, Line Feed).

    If you never see a 0x0D, then it's very likely Unix. If you see 0x0D 0x0A pairs then it's very likely MSDOS.
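The byte rule above can be turned into a quick check. This is only a minimal sketch: classify_eol is a made-up helper name, and the labels it prints ("LF", "CRLF", "mixed", "none") are my own choice. It counts the 0x0A bytes, counts the lines that end in 0x0D 0x0A, and compares the two numbers.

```shell
# classify_eol is a hypothetical helper name, not a standard tool.
# The body runs in a subshell "( ... )" so its variables stay local.
classify_eol() (
    file=$1
    cr=$(printf '\r')    # a literal 0x0D byte

    # Count every 0x0A byte in the file.
    lf=$(( $(tr -cd '\n' < "$file" | wc -c) ))
    # Count the lines whose last byte before the line feed is 0x0D.
    # grep exits non-zero when the count is 0, hence the "|| true".
    crlf=$(grep -c "${cr}\$" < "$file" || true)

    if [ "$lf" -eq 0 ]; then
        echo "none"      # no line feeds at all
    elif [ "$crlf" -eq 0 ]; then
        echo "LF"        # never saw 0x0D before 0x0A
    elif [ "$crlf" -eq "$lf" ]; then
        echo "CRLF"      # every line feed is preceded by 0x0D
    else
        echo "mixed"     # some lines end one way, some the other
    fi
)
```

Calling it as classify_eol some_file.txt prints one label for that file. A file whose lines are not all terminated the same way is reported as "mixed", which is the case worth inspecting by hand.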

  • 2020-12-23 09:54

    Once you know which files have Windows line endings (0x0D 0x0A, or \r\n), what will you do with them? I suppose you will convert them to Unix line endings (0x0A, or \n). You can convert a file from Windows to Unix line endings with the sed utility:

    $> sed -i 's/\r$//' my_file_with_win_line_endings.txt
    

    You can put it into script like this:

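    #!/bin/bash
    
    # Recursively strip the trailing carriage return from every line of
    # every regular file under the current directory.
    traverse()
    {
        local file
        # Glob instead of $(ls) so names containing spaces stay intact.
        for file in *; do
            if [ -f "${file}" ]; then
                sed -i 's/\r$//' "${file}"
            elif [ -d "${file}" ] && [ ! -L "${file}" ]; then
                # Subshell: the working directory is restored on return.
                ( cd "${file}" && traverse )
            fi
        done
    }
    
    traverse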
    

    If you run it from the root directory of your files, every file beneath it will end up with Unix line endings.
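One caveat with the script: sed rewrites every regular file it meets, binaries included. A more cautious variant could look like the sketch below. convert_crlf_tree is a hypothetical name, and the sketch assumes GNU sed (for -i, as in the command above) and a grep that supports -I to skip binary files (GNU and BSD grep do). Only text files that actually contain a carriage return at the end of a line are passed to sed.

```shell
# convert_crlf_tree is a hypothetical helper name.
# Usage: convert_crlf_tree [directory]   (defaults to the current directory)
convert_crlf_tree() (
    cr=$(printf '\r')    # a literal 0x0D byte

    # grep -l prints only the names of matching files; -I treats binary
    # files as non-matching, so they are never handed to sed.
    # Assumption: no file name contains a newline.
    find "${1:-.}" -type f -exec grep -lI "${cr}\$" {} + |
        while IFS= read -r path; do
            # Strip the carriage return that ends each line, in place.
            sed -i "s/${cr}\$//" "$path"
        done
)
```

Files that already have Unix line endings are left alone, so their modification times do not change.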

  • 2020-12-23 09:56

    You could use grep to list the files in the current directory that have a carriage return at the end of a line:

    egrep -l $'\r'\$ *
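The command above only looks at the current directory. If the files are spread over subdirectories, something like the following sketch might help. crlf_line_counts is a hypothetical name, and the -I and -H options assume GNU or BSD grep. It prints path:count for every text file that has at least one line ending in a carriage return, so a file with only a few such lines stands out as mixed.

```shell
# crlf_line_counts is a hypothetical helper name.
# Usage: crlf_line_counts [directory]   (defaults to the current directory)
crlf_line_counts() (
    cr=$(printf '\r')    # a literal 0x0D byte

    # -c counts matching lines per file, -H always prints the file name,
    # -I treats binary files as having no matches.
    find "${1:-.}" -type f -exec grep -cIH "${cr}\$" {} + |
        # Keep only files whose count (the last colon-separated field)
        # is greater than zero.
        awk -F: '$NF > 0'
)
```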
    
  • 2020-12-23 09:56

    Something along the lines of:

    perl -p -e 's[\r\n][WIN\n]; s[(?<!WIN)\n][UNIX\n]; s[\r][MAC\n];' FILENAME
    

    though some of that regexp may need refining and tidying up.

    That'll output your file with WIN, MAC, or UNIX at the end of each line. Good if your file is somehow a dreadful mess (or a diff) and has mixed endings.
