Brackets in a Request URL are legal but not in a URI (Java)?

Posted by 元气小坏坏 on 2020-01-03 11:00:10

Question


Apparently brackets are not allowed in URI paths.

I'm not sure whether this is a Tomcat problem, but I'm getting requests whose paths contain ].

In other words:

request.getRequestURL() == "http://localhost:8080/a]b"
request.getRequestURI() == "/a]b"

BTW, the values returned by getRequestURL() and getRequestURI() are generally already escaped, i.e. for http://localhost:8080/a b:

request.getRequestURL() == "http://localhost:8080/a%20b"

So if you try to do:

new URI("http://localhost:8080/a]b")
new URI(request.getRequestURL().toString())

It will fail with a URISyntaxException. If I escape the path myself, the existing %20 gets double-escaped (it becomes %2520).
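
Both failure modes can be reproduced with the JDK alone. The following is only an illustrative sketch: the class name BracketUriDemo and its helper methods are made up for this example, and the multi-argument java.net.URI constructor stands in for "escaping the path", which may not be the exact escaping that was tried in the question.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of the original question.
class BracketUriDemo {

    /** Returns true if the single-argument java.net.URI constructor accepts the string. */
    static boolean parses(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /**
     * Builds a URI with the multi-argument constructor, which quotes illegal
     * characters in the path and always quotes '%'.
     */
    static String quotePath(String path) {
        try {
            return new URI("http", "localhost", path, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("http://localhost:8080/a]b"));   // false: ']' is illegal in a path
        System.out.println(parses("http://localhost:8080/a%20b")); // true: already escaped
        System.out.println(quotePath("/a]b"));   // http://localhost/a%5Db
        System.out.println(quotePath("/a%20b")); // http://localhost/a%2520b (double-escaped)
    }
}
```

So the single-argument constructor rejects the bracket, while the quoting constructor fixes the bracket but damages escapes that are already present.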

How do I turn Servlet Request URLs into URIs?


Answer 1:


Java's java.net.URI is very strict and requires escaping of the "excluded US-ASCII characters" from RFC 2396.

To fix this, I encode those characters, and only those, except for '%' and '#', since the URL may already contain them. I used URIUtil from Commons HttpClient 3.x, which for some reason has no counterpart in HttpComponents.

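import java.net.URI;
import java.net.URISyntaxException;
import java.util.BitSet;
import org.apache.commons.httpclient.URIException;
import org.apache.commons.httpclient.util.URIUtil;

// Characters that URIUtil.encode leaves as-is; anything not in this set is percent-encoded.
private static final BitSet allowedUriChars = new BitSet(256);
static {
    allowedUriChars.set(0, 255, true);
    // Remove the excluded characters so that they get encoded.
    allowedUriChars.andNot(org.apache.commons.httpclient.URI.unwise);
    allowedUriChars.andNot(org.apache.commons.httpclient.URI.space);
    allowedUriChars.andNot(org.apache.commons.httpclient.URI.control);
    // Only part of "delims": '%' and '#' stay allowed because the URL may already contain them.
    allowedUriChars.set('<', false);
    allowedUriChars.set('>', false);
    allowedUriChars.set('"', false);
}

public static URI toURIorFail(String url) throws URISyntaxException, URIException {
    // URIUtil.encode returns a String, which java.net.URI then parses strictly.
    String encoded = URIUtil.encode(url, allowedUriChars, "UTF-8");
    return new URI(encoded);
}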
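
If pulling in Commons HttpClient 3.x is not an option, the same idea can be approximated with the JDK alone. This is a rough sketch rather than a drop-in replacement: the class LenientUri and its method names are hypothetical, the excluded set is written out by hand from RFC 2396 section 2.4.3, and, as an assumption of this sketch, non-ASCII bytes are percent-encoded as UTF-8 as well. It also escapes '[' and ']' everywhere, so it would break URLs with an IPv6 literal host such as http://[::1]/.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper; a JDK-only approximation of the approach above.
final class LenientUri {

    // RFC 2396 "delims" and "unwise" characters, minus '%' and '#',
    // which are kept because the URL may already contain escapes or a fragment.
    private static final String EXCLUDED = "<>\"{}|\\^[]`";

    /** Percent-encodes controls, space, non-ASCII bytes and the EXCLUDED characters. */
    static String escapeExcluded(String url) {
        StringBuilder out = new StringBuilder(url.length() + 8);
        for (byte b : url.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            if (c <= 0x20 || c >= 0x7F || EXCLUDED.indexOf(c) >= 0) {
                out.append('%').append(String.format("%02X", c));
            } else {
                out.append((char) c);
            }
        }
        return out.toString();
    }

    /** Escapes the URL and hands it to the strict java.net.URI parser. */
    static URI toUri(String url) throws URISyntaxException {
        return new URI(escapeExcluded(url));
    }
}
```

With this, LenientUri.toUri("http://localhost:8080/a]b") yields a URI whose raw path is /a%5Db, while an already escaped URL such as http://localhost:8080/a%20b passes through unchanged.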

Edit: Here are some related SO posts (more to come):

  • Which characters make a URL invalid?


Source: https://stackoverflow.com/questions/11038967/brackets-in-a-request-url-are-legal-but-not-in-a-uri-java
