Question
Python 3.7 introduced the PYTHONUTF8 environment variable to enable UTF-8 encoding by default. How do I set this variable from within a Python program? (I can't find it in my operating system's list of environment variables.)
Answer 1:
To access environment variables, and modify them if your platform allows it (which Windows and all popular Unixes do), just use os.environ.
However, this isn’t going to do any good, unless you’re trying to set the environment variable for Python child processes that you’re launching with subprocess or the like. Python reads its environment variables at startup, uses them to pick up configuration information, and doesn’t check them again later.
The point of these environment variables (and command-line flags) is to set them in your shell, launcher script, etc., so they’re available when Python starts, not to set them from within Python.
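A small experiment makes the difference between the running interpreter and a child process concrete. This is only a sketch, assuming Python 3.7 or later, where sys.flags.utf8_mode reports whether UTF-8 mode is active; child_utf8_mode is a hypothetical helper name, not part of any library.

```python
import os
import subprocess
import sys


def child_utf8_mode(value):
    """Start a fresh interpreter with PYTHONUTF8=value and report its utf8_mode flag.

    Hypothetical helper for illustration; assumes Python 3.7+.
    """
    # Work on a copy so the parent's own environment is left untouched.
    env = dict(os.environ, PYTHONUTF8=value)
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.flags.utf8_mode)"],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return int(result.stdout.strip())
```

Comparing child_utf8_mode("1") with sys.flags.utf8_mode in the calling process should show that the child picks up the variable, while assigning to os.environ in the parent leaves the parent's flag as it was at startup.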
Normally, if you need this setting, you need it globally, so set it in your shell profile script (on Linux), in your OS’s GUI for environment variables (on Windows), or both (on macOS). On a Mac, though, everything is already supposed to be set to UTF-8, and I believe that even if you manage to break that somehow, Python will ignore it.
You’re not going to find this in your existing list of environment variables (unless maybe you’re on an unusual Linux distro that does something odd with the locale settings but needs its default Python to ignore them), but that doesn’t matter; you can add any environment variables you want.
If you want to change things on the fly, setting an environment variable won’t do it, but for most of what it controls you don’t need one, either.
As the docs explain, UTF-8 mode controls three things: the filesystem encoding, the preferred encoding, and the encoding of the stdio files.
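The first two are less flexible at runtime than they sound. sys only offers a getter, sys.getfilesystemencoding() (the matching setter was removed in Python 3.2), and the preferred encoding follows the locale, so I believe the closest you can get is calling locale.setlocale() with a UTF-8 locale on Unix, or simply passing encoding='utf-8' explicitly wherever you open files.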
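Before changing anything, it can help to see what the interpreter actually picked up at startup. The following is a minimal sketch that reads the three settings named above; encoding_report is a hypothetical helper name, and the stdio entries may be None if a stream has been replaced by an object without an encoding attribute.

```python
import locale
import sys


def encoding_report():
    """Collect the encodings that UTF-8 mode would otherwise override."""
    return {
        "filesystem": sys.getfilesystemencoding(),
        # False: report the current locale's encoding without calling setlocale().
        "preferred": locale.getpreferredencoding(False),
        # A stdio object may be None or a replacement without .encoding.
        "stdin": getattr(sys.stdin, "encoding", None),
        "stdout": getattr(sys.stdout, "encoding", None),
        "stderr": getattr(sys.stderr, "encoding", None),
    }
```

Printing encoding_report() once with and once without -X utf8 should make it clear which of the values your environment gets wrong.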
If you also want to change the stdio files, that’s a bit trickier. On Python 3.7 and later, text streams have a reconfigure() method that can change the encoding in place; the approach that also works on older versions is to replace them with new file objects wrapped around the same file descriptor, which looks something like this (untested for now):
import sys
# Re-wrap the existing file descriptors in new UTF-8 text streams.
sys.stdout = open(sys.stdout.fileno(), 'w', encoding='utf-8', errors='surrogateescape')
sys.stderr = open(sys.stderr.fileno(), 'w', encoding='utf-8', errors='backslashreplace')
sys.stdin = open(sys.stdin.fileno(), 'r', encoding='utf-8', errors='surrogateescape')
If you’ve already printed anything to stdout or typed/piped anything into stdin, you may need to flush everything first.
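On Python 3.7 and later there may be a simpler route than replacing the objects: text streams created by the interpreter are io.TextIOWrapper instances, which have a reconfigure() method. The sketch below assumes that method is available; force_utf8 is a hypothetical helper name, and the demonstration uses an in-memory stream rather than the real stdout.

```python
import io


def force_utf8(stream, errors="surrogateescape"):
    """Switch an existing text stream to UTF-8 in place.

    Returns False when the stream has no reconfigure() method
    (Python < 3.7, or a replacement object that is not a TextIOWrapper).
    """
    reconfigure = getattr(stream, "reconfigure", None)
    if reconfigure is None:
        return False
    # reconfigure() flushes pending output before switching encodings.
    reconfigure(encoding="utf-8", errors=errors)
    return True


# Demonstration on an in-memory stream that starts out as ASCII.
raw = io.BytesIO()
text = io.TextIOWrapper(raw, encoding="ascii")
force_utf8(text)
text.write("caf\u00e9")
text.flush()
```

For the real streams this would be force_utf8(sys.stdout) and force_utf8(sys.stderr, errors="backslashreplace"). For sys.stdin, the encoding can only be changed before any data has been read from it.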
The only remaining issue that I know of is that sys.argv and os.environ will (at least on Unix) have already been decoded with the wrong encoding. You can fix the args by reencoding and redecoding before setting the default encodings. I think this uses the locale settings, so it would look like:
sys.argv = [arg.encode(locale.getpreferredencoding(), errors='surrogateescape').decode('utf8', errors='surrogateescape') for arg in sys.argv]
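To see why the re-encode and re-decode round trip works, here is a small sketch that simulates the startup mistake on a single string. redecode is a hypothetical helper name, and "ascii" stands in for whatever wrong encoding the interpreter used.

```python
def redecode(text, wrong_encoding):
    """Undo a decode done with wrong_encoding and decode the bytes as UTF-8.

    surrogateescape keeps undecodable bytes as lone surrogates, so the
    round trip back to the original bytes is lossless.
    """
    raw = text.encode(wrong_encoding, errors="surrogateescape")
    return raw.decode("utf-8", errors="surrogateescape")


# Simulate UTF-8 bytes being decoded as ASCII at interpreter startup.
original_bytes = "na\u00efve.txt".encode("utf-8")
mangled = original_bytes.decode("ascii", errors="surrogateescape")
repaired = redecode(mangled, "ascii")
```

Here mangled contains lone surrogates in place of the two bytes of the accented letter, and repaired is the intended file name again. On Unix, os.fsencode() does the first half of this using the filesystem encoding and its error handler, which can be used instead of naming the encoding by hand.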
Fixing the environment is a bit trickier, because if you try to mutate os.environ it’s going to do a putenv call that you don’t want. If this is an issue, the best option is probably to make a transcoded copy of environ and use that for lookups, and explicitly pass it to subprocess, etc.
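A transcoded copy could be built along these lines. This is a sketch only: transcoded_environ is a hypothetical helper name, and a plain dict stands in for os.environ so that the example does not depend on the real environment.

```python
def transcoded_environ(environ, wrong_encoding):
    """Return a new dict with keys and values re-decoded as UTF-8.

    The mapping passed in is not modified, so no putenv() call is triggered.
    """
    def fix(text):
        raw = text.encode(wrong_encoding, errors="surrogateescape")
        return raw.decode("utf-8", errors="surrogateescape")

    return {fix(key): fix(value) for key, value in environ.items()}


# A value as it would look after UTF-8 bytes were decoded as ASCII.
mangled_value = "gr\u00fc\u00df".encode("utf-8").decode("ascii", errors="surrogateescape")
fake_environ = {"GREETING": mangled_value}
clean = transcoded_environ(fake_environ, "ascii")
```

In a real program the call would be something like transcoded_environ(os.environ, locale.getpreferredencoding()), with lookups going to the returned dict. Passing that dict as env= to subprocess should only work if its values can be encoded with the interpreter's filesystem encoding, so that part is worth testing on the target system.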
Answer 2:
Usually you would specify this with a command-line argument:
python3.7 -X utf8
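If you want to enable UTF-8 mode from an environment variable instead:
# Linux / macOS (sh, bash, zsh)
export PYTHONUTF8=1
rem Windows (cmd.exe); a trailing "# ..." on the same line would become part of the value
set PYTHONUTF8=1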
It should be set before entering the Python runtime.
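If the setting really has to be applied from inside the program, one possible workaround is for the script to start a second interpreter with -X utf8 and hand over to it. The sketch below is an assumption-laden illustration, not an established recipe: utf8_relaunch_command and ensure_utf8_mode are hypothetical names, and the approach assumes Python 3.7+ and a program started as `python script.py args...`.

```python
import subprocess
import sys


def utf8_relaunch_command(argv=None):
    """Build a command line that reruns the current script with -X utf8.

    Assumes argv[0] is the script path, which does not hold for
    programs started with -m or -c.
    """
    argv = sys.argv if argv is None else argv
    return [sys.executable, "-X", "utf8"] + list(argv)


def ensure_utf8_mode():
    """Rerun the script in UTF-8 mode unless it is already active."""
    if sys.flags.utf8_mode:
        return  # already in UTF-8 mode, nothing to do
    # Run the child to completion and exit with its status code.
    sys.exit(subprocess.call(utf8_relaunch_command()))
```

Calling ensure_utf8_mode() as the first statement of the script keeps the relaunch cost to one extra interpreter start; the child sees utf8_mode set and carries on normally.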
Answer 3:
You can technically set it using os.environ["PYTHONUTF8"] = "1" (the value must be 1 to enable UTF-8 mode; 0 disables it, and any other non-empty string makes the interpreter fail at startup), but this won't affect the running script. By the time your script runs, Python has already started, checked for this variable, not found it, and therefore isn't using UTF-8 encoding by default. It would affect any Python interpreter you launched from your script, though.
The point of the environment variable is to set it before you launch your Python script. You do this in the same way you would set any other environment variable. You don't mention what OS you're using, but on Linux, you generally add the appropriate command to ~/.bash_profile. On Windows, you set them through the Environment Variables button on the Advanced tab of the System Properties dialog.
Obviously you won't find this environment variable in your system's list of environment variables if you haven't set it yet.
Source: https://stackoverflow.com/questions/50933194/how-do-i-set-the-pythonutf8-environment-variable-to-enable-utf-8-encoding-by-def