Why does Java's String.getBytes() use "ISO-8859-1"?

广开言路 2020-12-01 08:34

From java.lang.StringCoding:

String csn = (charsetName == null) ? "ISO-8859-1" : charsetName;

This is what is used from Java.lang.getByt

4 answers
  •  天涯浪人
    2020-12-01 09:00

    That's for compatibility reasons.

    Historically, Java methods on Windows and Unix that did not take a charset used the most common one at the time, which was "ISO-8859-1". In StringCoding it is only the fallback for when no charset name is available.

    As mentioned by Isaac and the Javadoc, String.getBytes() uses the platform's default encoding (see Charset.java):

    public static Charset defaultCharset() {
        if (defaultCharset == null) {
            synchronized (Charset.class) {
                String csn = AccessController.doPrivileged(
                    new GetPropertyAction("file.encoding"));
                Charset cs = lookup(csn);
                if (cs != null)
                    defaultCharset = cs;
                else
                    defaultCharset = forName("UTF-8");
            }
        }
        return defaultCharset;
    }
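
To see which charset this lookup resolves to on a given machine, a small check like the following may help. It is only a sketch: the class name DefaultCharsetDemo and the helper platformDefault() are made up for illustration, and the printed values depend on the platform and the JDK version.

```java
import java.nio.charset.Charset;

public class DefaultCharsetDemo {
    // Hypothetical helper: returns the charset the JVM falls back to
    // when a method such as String.getBytes() is called without one.
    static Charset platformDefault() {
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        // The system property consulted by the JDK code quoted above.
        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        // The charset the JVM actually resolved at startup.
        System.out.println("defaultCharset = " + platformDefault());
    }
}
```

Running it on two differently configured machines can print two different charsets, which is the reason the result of the no-argument String.getBytes() is not portable.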
    

    Always specify the charset when doing string to bytes or bytes to string conversion.

    Do this even though a few methods that take no charset, such as String.getBytes(), are still not deprecated (most of the others were deprecated when Java 1.1 appeared). Just as with endianness, the platform's format is irrelevant; what matters is the convention of the storage format.
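
A minimal sketch of the advice above, assuming Java 7 or later for java.nio.charset.StandardCharsets. The class name ExplicitCharsetDemo and the helpers encodeUtf8 and decodeUtf8 are hypothetical names chosen for the example; UTF-8 stands in for whatever encoding the storage format actually prescribes.

```java
import java.nio.charset.StandardCharsets;

public class ExplicitCharsetDemo {
    // Encode with an explicit charset instead of the platform default.
    static byte[] encodeUtf8(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decode with the same explicit charset that was used to encode.
    static String decodeUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // "café": the last character is outside ASCII

        byte[] utf8 = encodeUtf8(text);
        byte[] latin1 = text.getBytes(StandardCharsets.ISO_8859_1);

        // The same string yields different bytes under different charsets.
        System.out.println(utf8.length);   // 5: 'é' takes two bytes in UTF-8
        System.out.println(latin1.length); // 4: 'é' takes one byte in ISO-8859-1

        // Round-tripping with a matching charset restores the original.
        System.out.println(decodeUtf8(utf8).equals(text)); // true
    }
}
```

Because the charset is named on both sides of the conversion, the result is the same on every platform, whatever file.encoding is set to.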
