Scala or Java Library for fixing malformed URIs

Does anyone know of a good Scala or Java library that can fix common problems in malformed URIs, such as characters that should be escaped but aren't?


I've tested a few libraries, including the now-legacy URIUtil from HttpClient, without finding a viable solution. I've usually had reasonable success with this kind of java.net.URI construct, though:

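import java.net.URI;
import java.net.URL;

/**
 * Tries to construct a URL by breaking it up into its smallest elements
 * and encoding each component individually using the full URI constructor:
 *
 *    foo://example.com:8042/over/there?name=ferret#nose
 *    \_/   \______________/\_________/ \_________/ \__/
 *     |           |            |            |        |
 *  scheme     authority       path        query   fragment
 */
public URI parseUrl(String s) throws Exception {
    // java.net.URL splits the string leniently; the multi-argument URI
    // constructor then quotes illegal characters in each component.
    // Note: it also quotes '%', so already-escaped input gets double-encoded.
    URL u = new URL(s);
    return new URI(
        u.getProtocol(),
        u.getAuthority(),
        u.getPath(),
        u.getQuery(),
        u.getRef());
}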

This can be used in combination with the following routine, which repeatedly decodes a URL until the decoded string no longer changes. That can be useful against, e.g., double encoding. Note that, to keep it simple, this sample doesn't include any failsafe (for example, a limit on the number of passes).

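import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

// Recurses until a decoding pass leaves the string unchanged.
public String urlDecode(String url, String encoding) throws UnsupportedEncodingException, IllegalArgumentException {
    String result = URLDecoder.decode(url, encoding);
    return result.equals(url) ? result : urlDecode(result, encoding);
}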
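
To show how the two routines might fit together, here is a minimal sketch that decodes first and then re-encodes component by component. The class name `UrlRepair`, the method `urlDecodeBounded`, and the `MAX_PASSES` cap are assumptions made for illustration; they add the kind of failsafe the sample above leaves out. Note that decoding the whole string before parsing can change its structure if it contains escaped delimiters such as `%2F`, `%3F` or `%23`, so treat this as a starting point rather than a general fix.

```java
import java.io.UnsupportedEncodingException;
import java.net.MalformedURLException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URL;
import java.net.URLDecoder;

public class UrlRepair {

    // Hypothetical cap on decoding passes; not part of the original routine.
    private static final int MAX_PASSES = 5;

    /** Decodes repeatedly until the string stops changing or the cap is reached. */
    public static String urlDecodeBounded(String url, String encoding) {
        String current = url;
        for (int i = 0; i < MAX_PASSES; i++) {
            String decoded;
            try {
                decoded = URLDecoder.decode(current, encoding);
            } catch (UnsupportedEncodingException e) {
                throw new IllegalArgumentException("Unknown encoding: " + encoding, e);
            } catch (IllegalArgumentException e) {
                // A '%' that does not start a valid escape: keep what we have so far.
                return current;
            }
            if (decoded.equals(current)) {
                return decoded;
            }
            current = decoded;
        }
        return current;
    }

    /** Component-wise re-encoding, as in parseUrl above, with unchecked exceptions. */
    public static URI parseUrl(String s) {
        try {
            // new URL(String) is deprecated in recent JDKs but still available.
            URL u = new URL(s);
            return new URI(u.getProtocol(), u.getAuthority(), u.getPath(),
                           u.getQuery(), u.getRef());
        } catch (MalformedURLException | URISyntaxException e) {
            throw new IllegalArgumentException("Cannot repair: " + s, e);
        }
    }

    public static void main(String[] args) {
        String doubleEncoded = "http://example.com/a%2520b?q=x%2520y";
        String decoded = urlDecodeBounded(doubleEncoded, "UTF-8");
        System.out.println(decoded);           // http://example.com/a b?q=x y
        System.out.println(parseUrl(decoded)); // http://example.com/a%20b?q=x%20y
    }
}
```

Decoding first matters because the multi-argument URI constructor always quotes `%`, so feeding it an already-escaped string would encode it a second time.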

I would advise against using java.net.URLEncoder for percent-encoding URIs. Despite its name, it is a poor fit for encoding URLs: it does not follow RFC 3986 and instead encodes to the application/x-www-form-urlencoded MIME format.
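
The difference is easy to see on a path containing a space, a slash and a plus sign. The sketch below is only an illustration: the class `EncodingComparison` and its helper names are made up for this example, and it uses java.net.URI as the RFC-style encoder simply because it ships with the JDK.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class EncodingComparison {

    /** Encodes the way URLEncoder does: application/x-www-form-urlencoded. */
    public static String formEncode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always supported
        }
    }

    /** Percent-encodes a path by letting java.net.URI quote illegal characters. */
    public static String encodePath(String path) {
        try {
            // Constructor arguments: scheme, host, path, fragment.
            return new URI(null, null, path, null).getRawPath();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String path = "/over there/a+b";
        System.out.println(formEncode(path)); // %2Fover+there%2Fa%2Bb
        System.out.println(encodePath(path)); // /over%20there/a+b
    }
}
```

URLEncoder turns the space into `+` and escapes the path separators, which is right for form field values but produces a different path than the one intended.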

For encoding URIs in Scala I would recommend the Uri class from spray-http. scala-uri is an alternative (disclaimer: I'm the author).
