Java servlet download filename special characters

2018-06-07 05:18:58

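I am writing a simple file download servlet and I can't get the file names right. I tried URL-encoding and MIME-encoding the file name, as suggested in existing answers, but neither worked.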

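The fileData object in the following snippet contains the MIME type, the byte[] content, and the file name. The file name needs at least the ISO-8859-2 charset; ISO-8859-1 is not enough.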

How can I get the browser to display the downloaded file name correctly?

Here is an example of the file name: árvíztűrőtükörfúrógép.xls, and this is what it turns into: árvíztqrptükörfúrógép.xls

  protected void renderMergedOutputModel(Map model, HttpServletRequest req, HttpServletResponse res) throws Exception {

    RateDocument fileData = (RateDocument) model.get("command.retval");
    OutputStream out = res.getOutputStream();
    if(fileData != null) {
        res.setContentType(fileData.getMime());
        String enc = "utf-8"; //tried also: ISO-8859-2

        String encodedFileName = fileData.getName();
            // also tried URLencoding and mime encoding this filename without success

        res.setCharacterEncoding(enc); //tried with and without this
        res.setHeader("Content-Disposition", "attachment; filename=" + encodedFileName);
        res.setContentLength(fileData.getBody().length);
        out.write(fileData.getBody());
    } else {
        res.setContentType("text/html");
        out.write("<html><head></head><body>Error downloading file</body></html>"
                .getBytes(res.getCharacterEncoding()));
    }
    out.flush();
  }

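I found a solution that works in all the browsers I have installed (IE8, FF16, Opera12, Chrome22).
It relies on the fact that, when no [different] encoding is specified, browsers expect the value of the filename parameter to be encoded in the browser's native encoding.

Usually the browser's native encoding is utf-8 (Firefox, Opera, Chrome). But IE's native encoding is Win-1250.

So if the value we put into the filename parameter is encoded as utf-8 or win-1250, depending on the user's browser, it should work. At least, it works for me.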

String fileName = "árvíztűrőtükörfúrógép.xls";

String userAgent = request.getHeader("user-agent");
boolean isInternetExplorer = (userAgent.indexOf("MSIE") > -1);

try {
    byte[] fileNameBytes = fileName.getBytes((isInternetExplorer) ? ("windows-1250") : ("utf-8"));
    String dispositionFileName = "";
    for (byte b: fileNameBytes) dispositionFileName += (char)(b & 0xff);

    String disposition = "attachment; filename=\"" + dispositionFileName + "\"";
    response.setHeader("Content-disposition", disposition);
} catch(UnsupportedEncodingException ence) {
    // ... handle exception ...
}

Of course, this has only been tested on the browsers mentioned above, and I cannot guarantee that it will work in every browser all the time.
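As a side note on the byte-by-byte loop above: casting each byte (masked with 0xff) to a char gives the same string as decoding the bytes as ISO-8859-1. The trick therefore only works if the servlet container writes header characters back out as single ISO-8859-1 bytes, which is an assumption about the container and not something the code itself guarantees. A minimal sketch of the equivalence (the class and method names are made up for illustration):

```java
import java.nio.charset.StandardCharsets;

public class HeaderBytes {
    // Same idea as the loop in the answer: one char per raw byte.
    static String perByte(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length);
        for (byte b : bytes) {
            sb.append((char) (b & 0xff));
        }
        return sb.toString();
    }

    // Decoding as ISO-8859-1 also maps every byte to the char with the same value.
    static String latin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // "árvíztűrő.xls" written with Unicode escapes so the source file encoding does not matter
        byte[] utf8 = "\u00e1rv\u00edzt\u0171r\u0151.xls".getBytes(StandardCharsets.UTF_8);
        System.out.println(perByte(utf8).equals(latin1(utf8)));
    }
}
```

Using the ISO-8859-1 decoding directly avoids repeated string concatenation inside the loop.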

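Note #1 (@fallen): Using the URLEncoder.encode() method is not correct. Despite its name, the method does not produce URL encoding; it produces form encoding. (Form encoding is very similar to URL encoding and in many cases gives the same result, but there are some differences. For example, the space character is encoded differently: '+' instead of '%20'.)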

For a correctly URL-encoded string, you should use the URI class:

URI uri = new URI(null, null, "árvíztűrőtükörfúrógép.xls", null);
System.out.println(uri.toASCIIString());
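To make the difference concrete, here is a small sketch that runs both encoders on the same name. The class name, helper names, and sample file name are only illustrative; the calls themselves are the standard java.net ones.

```java
import java.io.UnsupportedEncodingException;
import java.net.URI;
import java.net.URISyntaxException;
import java.net.URLEncoder;

public class EncodingComparison {
    // Form encoding (application/x-www-form-urlencoded): a space becomes '+'.
    static String formEncode(String s) {
        try {
            return URLEncoder.encode(s, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    // URI path encoding: a space becomes "%20".
    static String uriEncode(String s) {
        try {
            return new URI(null, null, s, null).toASCIIString();
        } catch (URISyntaxException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        String name = "my file \u00e9.xls"; // "my file é.xls"
        System.out.println(formEncode(name)); // my+file+%C3%A9.xls
        System.out.println(uriEncode(name));  // my%20file%20%C3%A9.xls
    }
}
```

A browser that percent-decodes the header value would turn the form-encoded variant into a file name with literal plus signs, which is why the URI-based form is the safer of the two here.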

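Unfortunately, it depends on the browser. See this other topic, which discusses the problem. To solve your problem, look at this site with examples of different headers and their behavior in different browsers.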

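Based on the good answers given here, I have developed an extended version that I have put into production. It is based on RFC 5987 and this test suite.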

String filename = "freaky-multibyte-chars";
StringBuilder contentDisposition = new StringBuilder("attachment");
CharsetEncoder enc = StandardCharsets.US_ASCII.newEncoder();
boolean canEncode = enc.canEncode(filename);
if (canEncode) {
    contentDisposition.append("; filename=").append('"').append(filename).append('"');
} else {
    enc.onMalformedInput(CodingErrorAction.IGNORE);
    enc.onUnmappableCharacter(CodingErrorAction.IGNORE);

    String normalizedFilename = Normalizer.normalize(filename, Form.NFKD);
    CharBuffer cbuf = CharBuffer.wrap(normalizedFilename);

    ByteBuffer bbuf;
    try {
        bbuf = enc.encode(cbuf);
    } catch (CharacterCodingException e) {
        bbuf = ByteBuffer.allocate(0);
    }

    String encodedFilename = new String(bbuf.array(), bbuf.position(), bbuf.limit(),
            StandardCharsets.US_ASCII);

    if (StringUtils.isNotEmpty(encodedFilename)) {
        contentDisposition.append("; filename=").append('"').append(encodedFilename)
                .append('"');
    }

    URI uri;
    try {
        uri = new URI(null, null, filename, null);
    } catch (URISyntaxException e) {
        uri = null;
    }

    if (uri != null) {
        contentDisposition.append("; filename*=UTF-8''").append(uri.toASCIIString());
    }

}
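One caveat with building the filename* value through the URI class: as far as I can tell, it leaves some characters unescaped (for example ';' and the apostrophe) that RFC 5987 does not list among its attr-char set. A possible alternative, shown here only as a sketch and not as part of the answer above, is to percent-encode the UTF-8 bytes by hand and keep only the characters RFC 5987 allows unescaped. The class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

public class Rfc5987 {
    // Non-alphanumeric characters that RFC 5987 allows unescaped (attr-char).
    private static final String ATTR_CHARS = "!#$&+-.^_`|~";

    // Percent-encodes the UTF-8 bytes of value, leaving attr-char bytes as they are.
    static String encode(String value) {
        StringBuilder sb = new StringBuilder();
        for (byte b : value.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xff;
            boolean alphanumeric = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alphanumeric || ATTR_CHARS.indexOf(c) >= 0) {
                sb.append((char) c);
            } else {
                sb.append(String.format("%%%02X", c));
            }
        }
        return sb.toString();
    }

    // Builds only the extended parameter; an ASCII filename fallback can be added as in the answer.
    static String contentDisposition(String filename) {
        return "attachment; filename*=UTF-8''" + encode(filename);
    }

    public static void main(String[] args) {
        // "naïve file.txt", with the non-ASCII letter written as a Unicode escape
        System.out.println(contentDisposition("na\u00efve file.txt"));
        // attachment; filename*=UTF-8''na%C3%AFve%20file.txt
    }
}
```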
