How to convert CRLF to LF on a Windows machine in Python

我寻月下人不归 2020-12-14 20:20

So I have these templates. They all end in LF, and I can fill in some terms with format and still get LF files by opening the output with "wb".

Those templates

4 Answers
  • 2020-12-14 20:46

    Why not try str.replace('\r\n', '\n')?

    CRLF => \r\n, LF => \n

    The two conventions are a holdover from the typewriter days =)
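As a minimal sketch of that suggestion, assuming the text is already in memory as a str (the helper name crlf_to_lf is hypothetical, not part of the answer):

```python
def crlf_to_lf(text: str) -> str:
    """Return text with every CRLF pair replaced by a single LF."""
    return text.replace('\r\n', '\n')


sample = 'first line\r\nsecond line\r\n'
converted = crlf_to_lf(sample)
print(repr(converted))
```

This only changes the string in memory. When the result is written to disk in text mode on Windows, '\n' is translated back to '\r\n' unless the file is opened with newline='' or newline='\n', or in binary mode.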

  • 2020-12-14 20:56

    Convert Line Endings in-place (with Python 3)

    Windows to Linux/Unix

    Here is a short script for directly converting Windows line endings (\r\n also called CRLF) to Linux/Unix line endings (\n also called LF) in-place (without creating an extra output file):

    # replacement strings
    WINDOWS_LINE_ENDING = b'\r\n'
    UNIX_LINE_ENDING = b'\n'
    
    # relative or absolute file path, e.g.:
    file_path = r"c:\Users\Username\Desktop\file.txt"
    
    with open(file_path, 'rb') as open_file:
        content = open_file.read()
    
    content = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
    
    with open(file_path, 'wb') as open_file:
        open_file.write(content)
    

    Linux/Unix to Windows

    Just swap the constants for the line endings in the bytes.replace() call like so: content.replace(UNIX_LINE_ENDING, WINDOWS_LINE_ENDING).
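One caveat with the plain swap: if the file already contains some CRLF endings, each existing \r\n would become \r\r\n. A hedged sketch that avoids this by normalizing to LF first (the function name lf_to_crlf is an assumption made for illustration):

```python
WINDOWS_LINE_ENDING = b'\r\n'
UNIX_LINE_ENDING = b'\n'


def lf_to_crlf(content: bytes) -> bytes:
    """Convert LF to CRLF without doubling the CR of existing CRLF pairs."""
    # First collapse any CRLF to LF so every line ending looks the same...
    normalized = content.replace(WINDOWS_LINE_ENDING, UNIX_LINE_ENDING)
    # ...then expand every LF to CRLF.
    return normalized.replace(UNIX_LINE_ENDING, WINDOWS_LINE_ENDING)


mixed = b'one\r\ntwo\nthree\n'
print(lf_to_crlf(mixed))
```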


    Code Explanation

    • Important: Binary Mode. We need to make sure that we open the file both times in binary mode (mode='rb' and mode='wb') for the conversion to work.

      When a file is read in text mode (mode='r' without b), the line endings \r\n and \r are automatically converted to Python's Unix-style line ending \n, so the call to content.replace() couldn't find any \r\n line endings to replace. When a file is written in text mode (mode='w' without b), every \n is converted to the platform's native line ending, which on Windows puts the \r\n right back.

      In binary mode, no such conversion is done. Therefore the call to bytes.replace() can do its work.

    • Binary Strings: In Python 3, string literals are Unicode text (type str) unless declared otherwise. But we open our files in binary mode and therefore read bytes, so we need to add b in front of our replacement strings to make them bytes objects, too.

    • Raw Strings: On Windows the path separator is a backslash \ which we would need to escape in a normal Python string with \\. By adding r in front of the string we create a so-called "raw string" which doesn't need any escaping. So you can directly copy/paste the path from Windows Explorer into your script.

      (Hint: Inside Windows Explorer press CTRL+L to automatically select the path from the address bar.)

    • Alternative: We open the file twice to avoid the need to reposition the file pointer. We could also have opened the file once with mode='rb+', but then we would have needed to move the pointer back to the start after reading its content (open_file.seek(0)) and truncate its original content before writing the new one (open_file.truncate(0)).

      Simply opening the file again in write mode does that automatically for us.
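The single-open alternative described above could look roughly like this. It is a sketch rather than the answer's own code: the function name convert_in_place is hypothetical, and the demonstration runs on a throwaway temporary file instead of a real path.

```python
import os
import tempfile


def convert_in_place(file_path: str) -> None:
    """Rewrite file_path with CRLF replaced by LF, using a single open()."""
    with open(file_path, 'rb+') as open_file:
        content = open_file.read()
        content = content.replace(b'\r\n', b'\n')
        open_file.seek(0)      # move the pointer back to the start
        open_file.truncate(0)  # discard the original content
        open_file.write(content)


# Demonstration on a temporary file that is removed afterwards.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b'alpha\r\nbeta\r\n')
    demo_path = tmp.name

convert_in_place(demo_path)
with open(demo_path, 'rb') as check:
    result = check.read()
os.remove(demo_path)
print(result)
```

The truncate step matters because the converted content is shorter than the original; without it, leftover bytes from the old content would remain at the end of the file.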

    Cheers and happy programming,
    winklerrr

  • 2020-12-14 21:07

    In Python 3, open reads text files with universal newlines by default (newline=None), in which case it doesn't mind which sort of newline each line has: \r\n, \r and \n are all returned as \n. You can also request a specific form of newline for writing with the newline argument for open.

    Translating from one form to the other is thus rather simple in Python:

    # read with universal newlines, write with LF on every platform
    with open('filename.in', 'r', newline=None) as infile, \
         open('filename.out', 'w', newline='\n') as outfile:
        outfile.writelines(infile.readlines())
    

    (Python 2 needed the 'rU' mode for universal newlines. That U flag was deprecated in Python 3 and removed in Python 3.11; the equivalent form is newline=None.)
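To make the effect of the newline argument concrete, here is a small self-contained sketch. The temporary directory and the variable names are assumptions for the demonstration; only the file names filename.in and filename.out come from the answer.

```python
import os
import shutil
import tempfile

workdir = tempfile.mkdtemp()
src = os.path.join(workdir, 'filename.in')
dst = os.path.join(workdir, 'filename.out')

# Create a CRLF file in binary mode so nothing is translated.
with open(src, 'wb') as f:
    f.write(b'one\r\ntwo\r\n')

# newline=None (the default): universal newlines, '\r\n' is read as '\n'.
with open(src, 'r', newline=None) as f:
    translated = f.read()

# newline='': line endings are passed through untouched.
with open(src, 'r', newline='') as f:
    untranslated = f.read()

# newline='\n': '\n' is written as-is, even on Windows.
with open(dst, 'w', newline='\n') as f:
    f.write(translated)

with open(dst, 'rb') as f:
    written = f.read()

shutil.rmtree(workdir)
print(repr(translated), repr(untranslated), written)
```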

  • 2020-12-14 21:08

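    It is possible to fix existing templates with messed-up line endings with this code:

    # newline='' disables newline translation, so '\r\n' is still visible
    # when reading and '\n' is not turned back into '\r\n' when writing
    with open('file.tpl', newline='') as template:
        lines = [line.replace('\r\n', '\n') for line in template]
    with open('file.tpl', 'w', newline='') as template:
        template.writelines(lines)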
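To check whether a conversion actually worked, it can help to count the line endings in the raw bytes. A minimal sketch, with the helper name count_line_endings chosen here purely for illustration:

```python
def count_line_endings(data: bytes) -> dict:
    """Count CRLF pairs, bare LF and bare CR in a bytes object."""
    crlf = data.count(b'\r\n')
    lf = data.count(b'\n') - crlf   # LFs that are not part of a CRLF
    cr = data.count(b'\r') - crlf   # CRs that are not part of a CRLF
    return {'CRLF': crlf, 'LF': lf, 'CR': cr}


print(count_line_endings(b'a\r\nb\nc\r\n'))
```

For a real file, the bytes would come from open(path, 'rb').read(); a fully converted file should report zero for CRLF.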
    