URLConnection does not get the charset

眼角桃花 2020-12-15 05:53

I'm using URL.openConnection() to download something from a server. The server says:

Content-Type: text/plain; charset=utf-8

B

3 Answers
  •  不思量自难忘°
    2020-12-15 06:15

    URLConnection.getContentEncoding() returns the value of the Content-Encoding header (for example gzip), not the charset parameter of the Content-Type header.

    Source of URLConnection.getContentEncoding():

    /**
     * Returns the value of the content-encoding header field.
     *
     * @return  the content encoding of the resource that the URL references,
     *          or null if not known.
     * @see     java.net.URLConnection#getHeaderField(java.lang.String)
     */
    public String getContentEncoding() {
        return getHeaderField("content-encoding");
    }
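
    To see the distinction without touching the network, here is a minimal sketch. FixedHeaderConnection is a hypothetical stub (not part of the JDK) that answers getHeaderField() from a fixed map, and the example.invalid URL is only a placeholder that is never contacted. It relies on getContentEncoding() and getContentType() both delegating to getHeaderField(), as the source above shows for the former.

```java
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLConnection;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

/** Hypothetical stub: a URLConnection that serves fixed headers and never opens a socket. */
class FixedHeaderConnection extends URLConnection {
    private final Map<String, String> headers = new HashMap<>();

    FixedHeaderConnection(Map<String, String> responseHeaders) {
        super(placeholderUrl());
        // Store header names in lower case so lookups are case-insensitive.
        for (Map.Entry<String, String> e : responseHeaders.entrySet()) {
            headers.put(e.getKey().toLowerCase(Locale.ROOT), e.getValue());
        }
    }

    private static URL placeholderUrl() {
        try {
            return new URL("http://example.invalid/"); // placeholder, never contacted
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public void connect() {
        // Nothing to do: the headers are already in memory.
    }

    @Override
    public String getHeaderField(String name) {
        return headers.get(name.toLowerCase(Locale.ROOT));
    }
}
```

    With only the Content-Type header from the question in the map, getContentEncoding() on this stub returns null while getContentType() returns the full "text/plain; charset=utf-8" value, so the charset has to be taken from the latter.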
    

    Instead, call connection.getContentType() and extract the charset from the Content-Type value. Here is sample code showing how:

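    String contentType = connection.getContentType(); // null if the server sent no Content-Type
    String charset = "";

    if (contentType != null) {
        // Parameters follow the media type, separated by ';'
        for (String value : contentType.split(";")) {
            value = value.trim();

            if (value.toLowerCase().startsWith("charset=")) {
                // Drop optional double quotes, e.g. charset="utf-8"
                charset = value.substring("charset=".length()).replace("\"", "").trim();
            }
        }
    }

    if (charset.isEmpty()) {
        charset = "UTF-8"; // Assumption: fall back to UTF-8 when no charset is declared
    }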
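
    The parsed charset is only useful once it is applied when decoding the response body. The following is a rough sketch of that step, not a definitive implementation: ContentTypeCharset, charsetOf and readBody are hypothetical names chosen for illustration, and falling back to UTF-8 is an assumption carried over from the snippet above.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.URLConnection;
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;

class ContentTypeCharset {
    private static final String KEY = "charset=";

    /** Returns the charset named in a Content-Type value, or fallback if none is usable. */
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback; // no Content-Type header at all
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            // Case-insensitive check for the "charset=" prefix.
            if (p.regionMatches(true, 0, KEY, 0, KEY.length())) {
                String name = p.substring(KEY.length()).replace("\"", "").trim();
                if (name.isEmpty()) {
                    return fallback;
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
                    return fallback; // malformed or unknown charset name
                }
            }
        }
        return fallback; // Content-Type present but without a charset parameter
    }

    /** Reads the whole response body as text, decoded with the declared charset. */
    static String readBody(URLConnection connection) throws IOException {
        // Assumption: UTF-8 when the server declares nothing.
        Charset charset = charsetOf(connection.getContentType(), StandardCharsets.UTF_8);
        StringBuilder body = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(connection.getInputStream(), charset))) {
            char[] buffer = new char[4096];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                body.append(buffer, 0, n);
            }
        }
        return body.toString();
    }
}
```

    A call such as ContentTypeCharset.readBody(url.openConnection()) would then decode the download with the charset the server declared. Returning a Charset rather than a String means an unknown charset name is detected during parsing instead of later, when the reader is constructed.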
    
