URLConnection does not get the charset

眼角桃花 2020-12-15 05:53

I'm using URL.openConnection() to download something from a server. The server says

Content-Type: text/plain; charset=utf-8

B

3 answers
  • 2020-12-15 06:15

    URLConnection.getContentEncoding() returns the value of the Content-Encoding header, not the charset from Content-Type.

    Source of URLConnection.getContentEncoding():

    /**
     * Returns the value of the <code>content-encoding</code> header field.
     *
     * @return  the content encoding of the resource that the URL references,
     *          or <code>null</code> if not known.
     * @see     java.net.URLConnection#getHeaderField(java.lang.String)
     */
    public String getContentEncoding() {
        return getHeaderField("content-encoding");
    }
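To see the difference between the two getters without touching the network, here is a minimal sketch. `StubConnection` and the `example.invalid` URL are hypothetical names made up for this illustration; the stub only answers `getHeaderField(String)` from a map, and the inherited `getContentEncoding()` and `getContentType()` do the rest.

```java
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URL;
import java.net.URLConnection;
import java.util.Locale;
import java.util.Map;

// Hypothetical stub: a URLConnection that serves headers from a map instead of a server.
class StubConnection extends URLConnection {
    private final Map<String, String> headers; // keys are expected in lower case

    StubConnection(Map<String, String> headers) {
        super(stubUrl());
        this.headers = headers;
    }

    // Placeholder URL; nothing is ever fetched from it.
    private static URL stubUrl() {
        try {
            return URI.create("http://example.invalid/").toURL();
        } catch (MalformedURLException e) {
            throw new IllegalStateException(e);
        }
    }

    @Override
    public void connect() {
        // no-op: this stub never opens a socket
    }

    @Override
    public String getHeaderField(String name) {
        return headers.get(name.toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        URLConnection c = new StubConnection(
                Map.of("content-type", "text/plain; charset=utf-8"));
        // No Content-Encoding header was "sent", so this getter has nothing to return.
        System.out.println(c.getContentEncoding());
        // The charset is only visible as a parameter of Content-Type.
        System.out.println(c.getContentType());
    }
}
```

Content-Encoding normally describes a transformation such as gzip, which is why it is usually absent from a plain-text response like the one in the question.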
    

    Instead, call connection.getContentType() to retrieve the Content-Type header and extract the charset from it. Here is some sample code showing how to do this:

    String contentType = connection.getContentType(); // null if the header is absent
    String charset = "";
    
    if (contentType != null) {
        // e.g. "text/plain; charset=utf-8" -> ["text/plain", " charset=utf-8"]
        for (String value : contentType.split(";")) {
            value = value.trim();
    
            if (value.toLowerCase().startsWith("charset=")) {
                charset = value.substring("charset=".length());
            }
        }
    }
    
    if ("".equals(charset)) {
        charset = "UTF-8"; // Assumption: fall back to UTF-8 when no charset is given
    }
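The same idea can be wrapped in a small helper that returns a `java.nio.charset.Charset` directly. This is only a sketch using the JDK: the class name `ContentTypeCharset`, the method `charsetOf`, and the choice to take the fallback as a parameter are assumptions of this example. It also splits naively on `;`, so a quoted parameter value that itself contains a semicolon would not be handled.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Locale;

// Hypothetical helper, not part of the JDK.
class ContentTypeCharset {

    /** Returns the charset named in a Content-Type value, or fallback if none is usable. */
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback; // header was not sent
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters start at index 1
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String name = param.substring("charset=".length()).trim();
                // The value may be quoted, e.g. charset="utf-8"
                if (name.length() >= 2 && name.startsWith("\"") && name.endsWith("\"")) {
                    name = name.substring(1, name.length() - 1);
                }
                try {
                    return Charset.forName(name);
                } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
                    return fallback; // the server named a charset this JVM cannot use
                }
            }
        }
        return fallback; // no charset parameter present
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/plain; charset=utf-8", StandardCharsets.ISO_8859_1));
        System.out.println(charsetOf("text/html", StandardCharsets.ISO_8859_1));
    }
}
```

Passing the fallback explicitly keeps the "assume UTF-8" decision visible at the call site rather than buried in the parser.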
    
  • 2020-12-15 06:29

    As an addition to the answer from @Buhake Sindi: if you are using Guava, you can replace the manual parsing with:

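    // MediaType is com.google.common.net.MediaType;
    // Optional here is Guava's com.google.common.base.Optional, not java.util.Optional
    MediaType mediaType = MediaType.parse(httpConnection.getContentType());
    Optional<Charset> typeCharset = mediaType.charset();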
    
  • This is documented behaviour: getContentEncoding() is specified to return the contents of the Content-Encoding HTTP header, which is not set in your example. You could call getContentType() and parse the resulting String yourself, or switch to a more full-featured HTTP client library such as the one from Apache.
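Once a charset has been extracted from Content-Type, it is typically passed to an `InputStreamReader` to decode the response body. The sketch below assumes that has already happened; `BodyDecoder` and `readBody` are hypothetical names, and a `ByteArrayInputStream` stands in for `connection.getInputStream()` so the example runs offline.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, not part of the JDK.
class BodyDecoder {

    /** Reads the whole stream and decodes it with the given charset. */
    static String readBody(InputStream in, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(in, charset)) {
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "h\u00e9llo" is "hello" with an e-acute, encoded here as UTF-8 bytes
        byte[] utf8 = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
        // Decoding with the charset the server declared round-trips the text
        System.out.println(readBody(new ByteArrayInputStream(utf8), StandardCharsets.UTF_8));
        // Decoding with a different charset turns the two-byte e-acute into two wrong characters
        System.out.println(readBody(new ByteArrayInputStream(utf8), StandardCharsets.ISO_8859_1).length());
    }
}
```

In real use the first argument would be the stream obtained from the connection, and the second the charset parsed from its Content-Type header.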
