How does one encode query parameters to go on a url in Java? I know, this seems like an obvious and already asked question.
There are two subtleties I'm not sure of:
- Should spaces be encoded on the url as "+" or as "%20"? In chrome if I type in "http://google.com/foo=?bar me" chrome changes it to be encoded with %20
- Is it necessary/correct to encode colons ":" as %3B? Chrome doesn't.
Notes:
java.net.URLEncoder.encode
doesn't seem to work, it seems to be for encoding data to be form submitted. For example, it encodes space as+
instead of%20
, and encodes colon which isn't necessary.java.net.URI
doesn't encode query parameters
java.net.URLEncoder.encode(String s, String encoding)
can help too. It follows the HTML form encoding application/x-www-form-urlencoded
.
URLEncoder.encode(query, "UTF-8");
On the other hand, Percent-encoding (also known as URL encoding) encodes space with %20
. Colon is a reserved character, so :
will still remain a colon, after encoding.
EDIT: URIUtil
is no longer available in more recent versions, better answer at Java - encode URL or by Mr. Sindi in this thread.
URIUtil
of Apache httpclient is really useful, although there are some alternatives
URIUtil.encodeQuery(url);
For example, it encodes space as "+" instead of "%20"
Both are perfectly valid in the right context. Although if you really preferred you could issue a string replace.
Unfortunately, URLEncoder.encode() does not produce valid percent-encoding (as specified in http://tools.ietf.org/html/rfc3986#section-2.1).
URLEncoder.encode() encodes everything just fine, except space is encoded to "+". All the Java URI encoders that I could find only expose public methods to encode the query, fragment, path parts etc. - but don't expose the "raw" encoding. This is unfortunate as fragment and query are allowed to encode space to +, so we don't want to use them. Path is encoded properly but is "normalized" first so we can't use it for 'generic' encoding either.
Best solution I could come up with:
return URLEncoder.encode(raw, "UTF-8").replaceAll("\\+", "%20");
If replaceAll()
is too slow for you, I guess the alternative is to roll your own encoder...
EDIT: I had this code in here first which doesn't encode "?", "&", "=" properly:
//don't use - doesn't properly encode "?", "&", "="
new URI(null, null, null, raw, null).toString().substring(1);
It is not necessary to encode a colon as %3B in the query, although doing so is not illegal.
URI = scheme ":" hier-part [ "?" query ] [ "#" fragment ]
query = *( pchar / "/" / "?" )
pchar = unreserved / pct-encoded / sub-delims / ":" / "@"
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
pct-encoded = "%" HEXDIG HEXDIG
sub-delims = "!" / "$" / "&" / "'" / "(" / ")" / "*" / "+" / "," / ";" / "="
It also seems that only percent-encoded spaces are valid, as I doubt that space is an ALPHA or a DIGIT
look to the URI specification for more details.
The built in Java URLEncoder is doing what it's supposed to, and you should use it.
A "+" or "%20" are both valid replacements for a space character in a URL. Either one will work.
A ":" should be encoded, as it's a separator character. i.e. http://foo or ftp://bar. The fact that a particular browser can handle it when it's not encoded doesn't make it correct. You should encode them.
As a matter of good practice, be sure to use the method that takes a character encoding parameter. UTF-8 is generally used there, but you should supply it explicitly.
URLEncoder.encode(yourUrl, "UTF-8");
来源:https://stackoverflow.com/questions/5330104/encoding-url-query-parameters-in-java