How to convert CRLF to LF on a Windows machine in Python

我寻月下人不归 2020-12-14 20:20

So I have these templates, and they all end in LF. I can fill in some terms with format and still get LF files by opening them with "wb".

Those templates

4 answers
  •  刺人心 (original poster)
    2020-12-14 20:56

    Convert Line Endings in-place (with Python 3)

    Windows to Linux/Unix

    Here is a short script that converts Windows line endings (\r\n, also called CRLF) to Linux/Unix line endings (\n, also called LF) in place, without creating an extra output file:

    # replacement strings
    WINDOWS_LINE_ENDING = b'\r\n'
    UNIX_LINE_ENDING = b'\n'
    
    # relative or absolute file path, e.g.:
    file_path = r"c:\Users\Username\Desktop\file.txt"
    
    with open(file_path, 'rb') as open_file:
        content = open_file.read()
    
    content = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
    
    with open(file_path, 'wb') as open_file:
        open_file.write(content)
    

    Linux/Unix to Windows

    Just swap the constants for the line endings in the bytes.replace() call like so: content.replace(UNIX_LINE_ENDING, WINDOWS_LINE_ENDING).
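
    One caveat with the plain swap: if the file already contains some \r\n endings, replacing every \n turns them into \r\r\n. Here is a minimal sketch of one way around that, which normalizes everything to LF first. The function name to_crlf is hypothetical and not part of the original script.

```python
WINDOWS_LINE_ENDING = b'\r\n'
UNIX_LINE_ENDING = b'\n'


def to_crlf(content: bytes) -> bytes:
    """Convert LF line endings to CRLF without doubling existing CRLFs."""
    # First collapse any existing CRLF to LF so all line endings look the same ...
    normalized = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
    # ... then expand every LF to CRLF.
    return normalized.replace(UNIX_LINE_ENDING, WINDOWS_LINE_ENDING)


# A file with mixed line endings, as raw bytes.
mixed = b'one\r\ntwo\nthree\n'

# The plain swap doubles the carriage return on the first line.
naive = mixed.replace(UNIX_LINE_ENDING, WINDOWS_LINE_ENDING)

# Normalizing first keeps every line ending a single CRLF.
safe = to_crlf(mixed)
```

    For files that are known to contain only LF endings, the plain swap and to_crlf() give the same result.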


    Code Explanation

    • Important: Binary Mode We need to open the file in binary mode both times (mode='rb' and mode='wb') for the conversion to work.

      When a file is opened in text mode (mode='r' or mode='w' without b), Python translates line endings by default: on reading, \r\n and \r are converted to \n, and on writing, \n is converted back to the platform's native line ending (\r\n on Windows). So the call to content.replace() would not find any \r\n line endings to replace, and on Windows the file would be written with \r\n again anyway.

      In binary mode, no such conversion is done. Therefore the call to bytes.replace() can do its work.

    • Binary Strings In Python 3, a plain string literal is a str, that is, a sequence of Unicode characters. But we open our files in binary mode, which gives us bytes. Therefore we need to add b in front of our replacement strings to make them bytes literals, too. (Passing str arguments to bytes.replace() raises a TypeError.)

    • Raw Strings On Windows the path separator is a backslash \, which we would need to escape as \\ in a normal Python string. By adding r in front of the string we create a so-called "raw string", in which backslashes are not treated as escape characters. So you can copy/paste the path directly from Windows Explorer into your script. (One catch: a raw string cannot end in a single backslash.)

      (Hint: Inside Windows Explorer press CTRL+L to automatically select the path from the address bar.)

    • Alternative We open the file twice to avoid having to reposition the file pointer. We could also have opened the file once with mode='rb+', but then we would have needed to move the pointer back to the start after reading the content (open_file.seek(0)) and truncate the original content before writing the new one (open_file.truncate(0)).

      Simply opening the file again in write mode does that automatically for us.
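
    The points in the list above can be checked with a short sketch. It creates a throwaway file with tempfile (an assumption made only for the demonstration), shows what text mode and binary mode each return, and then runs the single-open 'rb+' variant. The function name convert_in_place is hypothetical.

```python
import os
import tempfile

WINDOWS_LINE_ENDING = b'\r\n'
UNIX_LINE_ENDING = b'\n'


def convert_in_place(file_path):
    """Convert CRLF to LF using a single open() call in 'rb+' mode."""
    with open(file_path, 'rb+') as open_file:
        content = open_file.read()
        content = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
        open_file.seek(0)       # move the pointer back to the start
        open_file.truncate(0)   # drop the original content
        open_file.write(content)


# Create a throwaway file with Windows line endings to experiment on.
fd, demo_path = tempfile.mkstemp(suffix='.txt')
with os.fdopen(fd, 'wb') as demo_file:
    demo_file.write(b'first\r\nsecond\r\n')

# Text mode translates line endings while reading, so '\r\n' is not visible.
with open(demo_path, 'r') as text_file:
    text_view = text_file.read()

# Binary mode returns the bytes exactly as they are stored on disk.
with open(demo_path, 'rb') as binary_file:
    before = binary_file.read()

# Mixing str and bytes in replace() is an error, hence the b'...' literals.
try:
    before.replace('\r\n', '\n')
    mixing_fails = False
except TypeError:
    mixing_fails = True

convert_in_place(demo_path)

with open(demo_path, 'rb') as binary_file:
    after = binary_file.read()

os.remove(demo_path)  # clean up the throwaway file
```

    Here text_view contains no \r characters even though before does, which is why the replacement has to operate on the binary content.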

    Cheers and happy programming,
    winklerrr
