URLConnection does not get the charset

无人久伴 submitted on 2019-11-27 14:37:24

Question


I'm using URL.openConnection() to download something from a server. The server says

Content-Type: text/plain; charset=utf-8

But connection.getContentEncoding() returns null. What's going on?


Answer 1:


This is documented behaviour: getContentEncoding() is specified to return the contents of the Content-Encoding HTTP header, which is not set in your example. That header describes how the body was encoded for transfer (for example gzip), not its character set. You could use the getContentType() method and parse the resulting String yourself, or use a more advanced HTTP client library such as the one from Apache.
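
To make the distinction concrete, here is a minimal sketch of how the two headers play different roles when reading a body. The class name BodyDecoder and its methods are hypothetical, made up for illustration; it assumes the body is already in memory as bytes and only handles the gzip content encoding.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper, not part of java.net
class BodyDecoder {

    /** Undoes the Content-Encoding (only gzip here), then decodes the bytes with the charset. */
    static String decode(byte[] raw, String contentEncoding, Charset charset) {
        try {
            InputStream in = new ByteArrayInputStream(raw);
            // Content-Encoding describes compression, not the character set.
            if (contentEncoding != null && contentEncoding.trim().equalsIgnoreCase("gzip")) {
                in = new GZIPInputStream(in);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            int n;
            while ((n = in.read(buffer)) != -1) {
                out.write(buffer, 0, n);
            }
            // The charset (from Content-Type) applies to the decompressed bytes.
            return new String(out.toByteArray(), charset);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Gzips a string so the sketch can be tried without a server. */
    static byte[] gzip(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gz = new GZIPOutputStream(bytes)) {
            gz.write(text.getBytes(charset));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    public static void main(String[] args) {
        byte[] compressed = gzip("h\u00e9llo", StandardCharsets.UTF_8);
        System.out.println(decode(compressed, "gzip", StandardCharsets.UTF_8));
    }
}
```

With a real connection, the second argument would come from connection.getContentEncoding() and the charset from the Content-Type header.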




Answer 2:


URLConnection.getContentEncoding() returns the value of the Content-Encoding header, not the charset.

Code from URLConnection.getContentEncoding():

/**
 * Returns the value of the <code>content-encoding</code> header field.
 *
 * @return  the content encoding of the resource that the URL references,
 *          or <code>null</code> if not known.
 * @see     java.net.URLConnection#getHeaderField(java.lang.String)
 */
public String getContentEncoding() {
    return getHeaderField("content-encoding");
}

Instead, call connection.getContentType() to retrieve the Content-Type header and extract the charset from it. Here is some sample code showing how to do this:

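String contentType = connection.getContentType(); // null if the header is missing
String charset = "";

if (contentType != null) {
    // Parameters follow the media type, separated by ';'
    for (String value : contentType.split(";")) {
        value = value.trim();

        if (value.toLowerCase().startsWith("charset=")) {
            // Strip optional quotes, e.g. charset="utf-8"
            charset = value.substring("charset=".length()).replace("\"", "").trim();
        }
    }
}

if (charset.isEmpty()) {
    charset = "UTF-8"; // Assumption: fall back to UTF-8 when no charset is declared
}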
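
If you want a java.nio.charset.Charset object rather than a String, the same parsing can be wrapped in a small helper. This is only a sketch: the class name ContentTypeCharset and the method charsetOf are hypothetical, and falling back to a caller-supplied charset when the name is missing or unknown is an assumption, not something the header format mandates.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.StandardCharsets;
import java.nio.charset.UnsupportedCharsetException;
import java.util.Locale;

// Hypothetical helper, not part of java.net
class ContentTypeCharset {

    /** Returns the charset named in a Content-Type value, or the fallback if none is usable. */
    static Charset charsetOf(String contentType, Charset fallback) {
        if (contentType == null) {
            return fallback; // header missing
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type; parameters start at index 1
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            if (param.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                // Strip optional quotes, e.g. charset="utf-8"
                String name = param.substring("charset=".length()).replace("\"", "").trim();
                try {
                    return Charset.forName(name);
                } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
                    return fallback; // malformed or unknown charset name
                }
            }
        }
        return fallback; // no charset parameter present
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/plain; charset=utf-8", StandardCharsets.ISO_8859_1));
    }
}
```

The result can then be passed straight to a reader, for example new InputStreamReader(connection.getInputStream(), ContentTypeCharset.charsetOf(connection.getContentType(), StandardCharsets.UTF_8)).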



Answer 3:


As an addition to the answer from @Buhake Sindi: if you are using Guava, you can replace the manual parsing with:

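// MediaType.parse throws IllegalArgumentException if the header value is malformed
MediaType mediaType = MediaType.parse(httpConnection.getContentType());
// Note: this is Guava's com.google.common.base.Optional, not java.util.Optional
Optional<Charset> typeCharset = mediaType.charset();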


Source: https://stackoverflow.com/questions/3934251/urlconnection-does-not-get-the-charset
