Brackets in a Request URL are legal but not in a URI (Java)?

Apparently brackets are not allowed in URI paths.

I'm not sure if this is a Tomcat problem, but I'm getting requests whose paths contain ] .

In other words:

request.getRequestURL() == "http://localhost:8080/a]b"
request.getRequestURI() == "/a]b"

BTW, getRequestURL() and getRequestURI() generally return escaped values, i.e. for http://localhost:8080/a b

request.getRequestURL() == "http://localhost:8080/a%20b"

So if you try to do:

new URI("http://localhost:8080/a]b")
new URI(request.getRequestURL().toString())

It fails with a URISyntaxException. If I escape the path myself, the %20 that is already there gets double-escaped (it becomes %2520).
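Both failure modes can be reproduced with java.net.URI alone. The sketch below is only an illustration, not code from the original post; the class and method names (BracketUriDemo, parses, viaComponents) are made up. It assumes the multi-argument URI constructor is what one would reach for to "escape the path".

```java
import java.net.URI;
import java.net.URISyntaxException;

final class BracketUriDemo {

    /** Returns true if the single-argument URI constructor accepts the string as-is. */
    static boolean parses(String url) {
        try {
            new URI(url);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    /**
     * Builds a URI from components. The multi-argument constructors quote
     * illegal characters in the path and always quote '%'.
     */
    static String viaComponents(String path) {
        try {
            return new URI("http", null, "localhost", 8080, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("http://localhost:8080/a]b"));   // false
        System.out.println(parses("http://localhost:8080/a%20b")); // true
        System.out.println(viaComponents("/a]b"));    // http://localhost:8080/a%5Db
        System.out.println(viaComponents("/a%20b"));  // http://localhost:8080/a%2520b
    }
}
```

So the single-argument constructor is too strict for "a]b", and the multi-argument one is too eager for "a%20b"; neither handles a request URL that mixes the two cases.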

How do I turn Servlet Request URLs into URIs?


Java's URI appears to be very strict and requires escaping for the "excluded US-ASCII characters" (RFC 2396, section 2.4.3).

To fix this I encode those characters, and only those, minus '%' and '#', since the URL may already contain them as escapes or as a fragment delimiter. I used URIUtil from Commons HttpClient 3.x, which for some reason is not in HttpComponents.

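import java.net.URI;
import java.net.URISyntaxException;
import java.util.BitSet;
import org.apache.commons.httpclient.URIException;
import org.apache.commons.httpclient.util.URIUtil;

// Despite the name, this is the set of characters URIUtil.encode leaves unescaped.
private static BitSet badUriChars = new BitSet(256);
static {
    badUriChars.set(0, 255, true);
    // Drop the excluded characters so that they get percent-encoded.
    badUriChars.andNot(org.apache.commons.httpclient.URI.unwise);
    badUriChars.andNot(org.apache.commons.httpclient.URI.space);
    badUriChars.andNot(org.apache.commons.httpclient.URI.control);
    // The delimiters, except '%' and '#', which stay untouched.
    badUriChars.set('<', false);
    badUriChars.set('>', false);
    badUriChars.set('"', false);
}

public static URI toURIorFail(String url) throws URISyntaxException, URIException {
    // URIUtil.encode returns a String and may throw URIException.
    String escaped = URIUtil.encode(url, badUriChars, "UTF-8");
    return new URI(escaped);
}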
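If Commons HttpClient 3.x is not on the classpath, roughly the same idea can be sketched with the JDK only. This is not the code I actually used; the class name LenientUri and its methods are hypothetical, and the character list is my reading of the excluded set minus '%' and '#'. It also assumes the host is not an IPv6 literal, because '[' and ']' are escaped everywhere in the string.

```java
import java.net.URI;
import java.net.URISyntaxException;

final class LenientUri {

    // Space, the remaining delims and the "unwise" characters; '%' and '#'
    // are deliberately left out so existing escapes and fragments survive.
    private static final String ESCAPE = " \"<>[]\\^`{|}";

    /** Percent-encodes only ASCII control characters and the characters in ESCAPE. */
    static String escapeIllegal(String url) {
        StringBuilder out = new StringBuilder(url.length() + 16);
        for (int i = 0; i < url.length(); i++) {
            char c = url.charAt(i);
            if (c < 0x20 || c == 0x7F || ESCAPE.indexOf(c) >= 0) {
                // Every character handled here is below 0x80, so one byte is enough.
                out.append('%');
                out.append(Character.toUpperCase(Character.forDigit(c >> 4, 16)));
                out.append(Character.toUpperCase(Character.forDigit(c & 0xF, 16)));
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    /** Escapes the offending characters, then parses with the strict constructor. */
    static URI toUri(String url) throws URISyntaxException {
        return new URI(escapeIllegal(url));
    }
}
```

With this, LenientUri.toUri(request.getRequestURL().toString()) should accept "http://localhost:8080/a]b" (as "/a%5Db") while leaving "http://localhost:8080/a%20b" unchanged.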

Edit: Here are some related SO posts (more to come):

  • Which characters make a URL invalid?