Brackets in a Request URL are legal but not in a URI (Java)?
Apparently brackets are not allowed in URI paths.
I'm not sure if this is a Tomcat problem, but I'm getting requests with paths that contain ].
In other words:
request.getRequestURL() == "http://localhost:8080/a]b"
request.getRequestURI() == "/a]b"
BTW, getRequestURL() and getRequestURI() are generally escaped, i.e. for http://localhost:8080/a b
request.getRequestURL() == "http://localhost:8080/a%20b"
So if you try to do:
new URI("http://localhost:8080/a]b")
new URI(request.getRequestURL().toString())
It will fail with a URISyntaxException. If I escape the path myself, the existing %20
gets double-escaped (it becomes %2520).
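To make the failure concrete, here is a minimal, self-contained sketch using only java.net.URI. The class and helper names (BracketUriDemo, parses, viaComponents) are made up for illustration. It shows that the single-string constructor rejects the raw bracket, while the multi-argument constructor escapes the bracket but also re-escapes an existing %20.

```java
import java.net.URI;
import java.net.URISyntaxException;

// Hypothetical demo class; not part of the original question.
class BracketUriDemo {

    // Returns true if java.net.URI accepts the string exactly as given.
    static boolean parses(String s) {
        try {
            new URI(s);
            return true;
        } catch (URISyntaxException e) {
            return false;
        }
    }

    // Builds the URI from components. This constructor quotes illegal
    // characters in the path, and it always quotes '%' as well.
    static String viaComponents(String path) {
        try {
            return new URI("http", null, "localhost", 8080, path, null, null).toString();
        } catch (URISyntaxException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(parses("http://localhost:8080/a]b"));   // false
        System.out.println(parses("http://localhost:8080/a%20b")); // true
        System.out.println(viaComponents("/a]b"));    // http://localhost:8080/a%5Db
        System.out.println(viaComponents("/a%20b"));  // http://localhost:8080/a%2520b
    }
}
```

So neither constructor works on its own for a URL that mixes raw brackets with characters that are already escaped.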
How do I turn Servlet Request URLs into URIs?
Java's URI appears to be very strict and requires escaping for the excluded US-ASCII characters.
To fix this I encode those and only those characters, minus '%' and '#', as the URL may already contain those characters. I used HttpClient's URIUtil, which for some reason is not in HttpComponents.
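import java.net.URI;
import java.net.URISyntaxException;
import java.util.BitSet;
import org.apache.commons.httpclient.URIException;
import org.apache.commons.httpclient.util.URIUtil;

// Despite the name, this is the set of characters URIUtil.encode may leave unescaped.
private static final BitSet badUriChars = new BitSet(256);
static {
    badUriChars.set(0, 255, true);
    // Clearing a character here forces it to be percent-encoded.
    badUriChars.andNot(org.apache.commons.httpclient.URI.unwise);
    badUriChars.andNot(org.apache.commons.httpclient.URI.space);
    badUriChars.andNot(org.apache.commons.httpclient.URI.control);
    badUriChars.set('<', false);
    badUriChars.set('>', false);
    badUriChars.set('"', false);
}

public static URI toURIorFail(String url) throws URISyntaxException, URIException {
    // URIUtil.encode returns the encoded String, not a URI.
    String encoded = URIUtil.encode(url, badUriChars, "UTF-8");
    return new URI(encoded);
}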
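If you would rather not depend on the old Commons HttpClient 3.x classes, the same idea can probably be sketched with the JDK alone. This is a rough sketch under my own assumptions, not a drop-in replacement for the code above: LenientUri, escapeExcluded and toUriOrFail are hypothetical names, the excluded set is written out by hand, and non-ASCII characters are also percent-encoded as UTF-8. '%' and '#' are left untouched, so a stray '%' that is not part of a valid escape still fails, and brackets in an IPv6 host literal would get escaped too.

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the names are illustrative only.
class LenientUri {

    // Excluded US-ASCII characters (space, delims, unwise), minus '%' and '#',
    // which may already be part of a valid escape or a fragment separator.
    private static final String EXCLUDED = " <>\"{}|\\^[]`";

    // Percent-encodes control characters, the excluded set and (by assumption)
    // everything outside ASCII; all other characters are copied through unchanged.
    static String escapeExcluded(String url) {
        StringBuilder out = new StringBuilder(url.length() + 8);
        url.codePoints().forEach(cp -> {
            boolean mustEscape = cp <= 0x1F || cp >= 0x7F || EXCLUDED.indexOf(cp) >= 0;
            if (!mustEscape) {
                out.append((char) cp);
                return;
            }
            byte[] bytes = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            for (byte b : bytes) {
                out.append('%').append(String.format("%02X", b & 0xFF));
            }
        });
        return out.toString();
    }

    // Still throws if the escaped string is not a valid URI.
    static URI toUriOrFail(String url) throws URISyntaxException {
        return new URI(escapeExcluded(url));
    }
}
```

With this, LenientUri.toUriOrFail("http://localhost:8080/a]b") yields a URI whose getPath() is "/a]b", and an input that already contains %20 passes through without being escaped a second time.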
Edit: Here are some related SO posts (more to come):