“ValueError: embedded null character” when using open()

一整个雨季 2021-01-04 06:48

I am taking Python at my college and I am stuck on my current assignment. We are supposed to take two files and compare them. I am simply trying to open the files so I can

4 Answers
  •  天命终不由人
    2021-01-04 07:37

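    In Python 3.5, source files default to 'utf-8', but open() called without an encoding argument uses the locale's preferred encoding.

    On Windows, that tends to be something other than UTF-8 (for example, a legacy code page such as cp1252).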
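
To see which encodings are in play on a given machine, a quick check along these lines may help. It is only a sketch: the variable names are made up for illustration, and the first value depends on the operating system and its settings.

```python
import locale
import sys

# Encoding open() falls back to when no `encoding` argument is given.
default_text_encoding = locale.getpreferredencoding(False)

# Encoding used for str <-> bytes conversions; "utf-8" on Python 3.
string_encoding = sys.getdefaultencoding()

print("open() default:", default_text_encoding)
print("str default:   ", string_encoding)
```

If the first value is not a UTF-8 variant, files saved as UTF-8 by another tool may be decoded incorrectly unless an encoding is passed to open() explicitly.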

    If you intend to open two text files, you may try this:

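    import locale

    # Encoding the operating system prefers for text files
    # (locale.getdefaultlocale() is deprecated since Python 3.11)
    encoding = locale.getpreferredencoding(False)
    file1 = input("Enter the name of the first file: ")
    with open(file1, encoding=encoding) as file1_open:
        file1_content = file1_open.read()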
    

    The standard library has no general-purpose automatic encoding detection for text files, so you may create your own:

    import locale

    def guess_encoding(csv_file):
        """Guess the encoding of the given file."""
        # Read the first few bytes to look for a byte order mark (BOM)
        with open(csv_file, "rb") as f:
            data = f.read(5)
        if data.startswith(b"\xEF\xBB\xBF"):  # UTF-8 with a BOM
            return "utf-8-sig"
        elif data.startswith(b"\xFF\xFE") or data.startswith(b"\xFE\xFF"):  # UTF-16 BOM
            return "utf-16"
        else:
            # No BOM: try decoding a preview as UTF-8 and fall back to
            # the locale's preferred encoding if that fails
            try:
                with open(csv_file, encoding="utf-8") as f:
                    f.read(222222)
                return "utf-8"
            except UnicodeDecodeError:
                return locale.getpreferredencoding(False)
    

    and then

    file1 = input("Enter the name of the first file: ")
    with open(file1, encoding=guess_encoding(file1)) as file1_open:
        file1_content = file1_open.read()
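
The BOM checks in this approach can be tried out in isolation. The sketch below is not part of the answer's code: sniff_bom and write_sample are hypothetical helper names, and it uses the BOM constants from the codecs module instead of hand-written byte strings. It writes three small temporary files and reports what the BOM check sees in each.

```python
import codecs
import os
import tempfile


def sniff_bom(path):
    """Return an encoding name if the file starts with a known BOM, else None."""
    with open(path, "rb") as f:
        head = f.read(3)
    if head.startswith(codecs.BOM_UTF8):
        return "utf-8-sig"
    if head.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        return "utf-16"
    return None


def write_sample(text, encoding):
    """Write text to a temporary file and return its path (illustration only)."""
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w", encoding=encoding, newline="") as f:
        f.write(text)
    return path


# One file per encoding; "utf-8-sig" and "utf-16" both write a BOM, "utf-8" does not.
samples = {enc: write_sample("héllo", enc) for enc in ("utf-8-sig", "utf-16", "utf-8")}
results = {enc: sniff_bom(path) for enc, path in samples.items()}
for path in samples.values():
    os.remove(path)

print(results)
```

A file without a BOM yields None here, which is the case where the answer's function falls back to a trial UTF-8 decode.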
    
