How can I detect DOS line breaks in a file?

半阙折子戏 2020-12-29 09:46

I have a bunch of files. Some have Unix line endings, many have DOS line endings. I'd like to test each file to see whether it is DOS-formatted before I convert the line endings.

How

7 answers
  •  一整个雨季
    2020-12-29 10:27

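    Python can detect which newline convention a file uses through its "universal newlines" mode, and it exposes what it found in the newlines attribute of file objects. In Python 2 this mode is requested with the 'U' flag; in Python 3 it is the default for text mode (newline=None), and the 'U' flag was removed in Python 3.11:

    with open('myfile.txt', newline=None) as f:  # Python 3; Python 2 used open('myfile.txt', 'U')
        f.readline()  # Reads a line
        # f.newlines now holds the newline convention(s) seen so far:
        # "\r\n" (Windows), "\n" (Unix), "\r" (Mac OS pre-OS X),
        # or a tuple if more than one kind has been seen.
        # If no newline has been read yet, it is None.
        print(repr(f.newlines))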
    

    This reports the newline convention in use (Unix, DOS, etc.), if any. For a file with consistent line endings, reading the first line is enough.

    As John M. pointed out, if you happen to have a pathological file that mixes several newline conventions, f.newlines is a tuple of all the conventions found so far as more of the file is read.

    Reference: http://docs.python.org/2/library/functions.html#open
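To test a whole batch of files, as the question asks, the newlines attribute can be wrapped in a small helper. The following is a minimal Python 3 sketch, not a definitive implementation: the names `newline_styles` and `is_dos` are hypothetical, and decoding as latin-1 is an assumption that only holds for ASCII-compatible encodings (it would misread UTF-16 files, for example).

```python
def newline_styles(path):
    """Return the set of newline conventions found in the file at `path`."""
    # newline=None (the default) turns on universal newlines in Python 3.
    # latin-1 maps every byte to a character, so decoding never fails;
    # this is an assumption that suits ASCII-compatible encodings only.
    with open(path, "r", newline=None, encoding="latin-1") as f:
        for _ in f:  # consume the whole file so every line ending is seen
            pass
        seen = f.newlines
    if seen is None:  # no line ending anywhere in the file
        return set()
    if isinstance(seen, str):  # exactly one convention was found
        return {seen}
    return set(seen)  # a tuple: the file mixes conventions


def is_dos(path):
    """Return True if the file contains at least one CRLF line ending."""
    return "\r\n" in newline_styles(path)
```

Calling `is_dos(name)` for each file name then tells you which files need converting; a file that mixes conventions counts as DOS here as soon as one CRLF appears.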

    If you just want to convert a file, you can simply do:

    with open('myfile.txt', newline=None) as infile:  # Python 3; Python 2 used mode 'U'
        text = infile.read()  # Universal newlines: "\r\n" and "\r" are converted to "\n"
    with open('myfile.txt', 'w') as outfile:
        outfile.write(text)  # Writes "\n" as the newline of the platform running the program
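The conversion above writes whatever newline the current platform uses, so running it on Windows produces DOS line endings again. If the goal is Unix line endings regardless of platform, one possible variant passes newline="\n" when writing. This is a sketch under assumptions: the function name `to_unix_newlines` and its `encoding` parameter are hypothetical, and UTF-8 is only a guess at the files' encoding.

```python
def to_unix_newlines(path, encoding="utf-8"):
    """Rewrite the file at `path` in place so every line ends with LF."""
    # Reading with newline=None translates "\r\n" and "\r" to "\n".
    with open(path, "r", newline=None, encoding=encoding) as infile:
        text = infile.read()
    # Writing with newline="\n" disables translation on output, so "\n"
    # is written unchanged even when the program runs on Windows.
    with open(path, "w", newline="\n", encoding=encoding) as outfile:
        outfile.write(text)
```

Because the file is read fully into memory before being overwritten, this suits ordinary text files; very large files would call for a line-by-line copy to a temporary file instead.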
    
