Json converts & in a String to \u0026

I am trying to extract text from a PDF and write it to a JSON file. When the extracted text contains Unicode character references, every & in the JSON output is converted to \u0026. For example, my actual string is &#1588; (which represents ش). It prints correctly to a .txt file, to the console, etc., but when I write this string to a JSON file it shows up as \u0026#1588;.

I am using Java, and the code is

Gson gson = new Gson();
String json = gson.toJson(pdfDoc);

Note: pdfDoc is an object that contains all the details (position, color, font, etc.) of the characters in the input PDF document. I am using gson-2.2.1.jar.


That's actually a valid (but not required) encoding. Any character may be written as a Unicode escape in JSON, and any conforming JSON parser must be able to interpret those escapes.
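To make the equivalence concrete, here is a minimal sketch of what a parser does with such an escape. It uses only the Java standard library; the class and method names are made up for illustration, and it deliberately handles only the backslash-u form, so it is not a replacement for a real JSON parser.

```java
final class JsonUnicodeEscapeDemo {

    // Decodes backslash-u escapes (four hex digits) in the body of a JSON string.
    // Simplified on purpose: all other escapes are copied through unchanged.
    static String decodeUnicodeEscapes(String s) {
        StringBuilder out = new StringBuilder();
        int i = 0;
        while (i < s.length()) {
            char c = s.charAt(i);
            if (c == '\\' && i + 1 < s.length()) {
                if (s.charAt(i + 1) == 'u' && i + 6 <= s.length() && isHex(s, i + 2, i + 6)) {
                    // Four hex digits follow: emit the character they denote.
                    out.append((char) Integer.parseInt(s.substring(i + 2, i + 6), 16));
                    i += 6;
                } else {
                    // Copy the two-character escape as-is so an escaped backslash is not misread.
                    out.append(c).append(s.charAt(i + 1));
                    i += 2;
                }
            } else {
                out.append(c);
                i++;
            }
        }
        return out.toString();
    }

    // True if every character in s[from, to) is a hexadecimal digit.
    private static boolean isHex(String s, int from, int to) {
        for (int k = from; k < to; k++) {
            if (Character.digit(s.charAt(k), 16) < 0) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // The escaped form, as it appears in the JSON file from the question.
        String escaped = "\\u0026#1588;";
        System.out.println(decodeUnicodeEscapes(escaped)); // prints &#1588;
    }
}
```

Any conforming parser applies the same mapping, so a consumer that reads the file with a JSON library gets the original &#1588; back.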

& is not one of the characters that must be escaped (see the definition of string at json.org), but a few JSON libraries are quite "aggressive" in what they escape; Gson, by default, escapes HTML-sensitive characters such as &, < and >. That's not usually a problem, unless the resulting JSON is read by something other than a conforming JSON parser.

If you absolutely need the literal & in the output, create your Gson instance through GsonBuilder and call disableHtmlEscaping() to turn that escaping off.
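A short sketch of the difference, assuming Gson is on the classpath. The class and method names are hypothetical, and a plain String stands in for the pdfDoc object from the question; the same Gson instance would serialize pdfDoc the same way.

```java
import com.google.gson.Gson;
import com.google.gson.GsonBuilder;

final class GsonHtmlEscapingDemo {

    // Default configuration: HTML-sensitive characters are written as Unicode escapes.
    static String defaultJson(String text) {
        return new Gson().toJson(text);
    }

    // HTML escaping disabled: & is written literally.
    static String unescapedJson(String text) {
        Gson gson = new GsonBuilder().disableHtmlEscaping().create();
        return gson.toJson(text);
    }

    public static void main(String[] args) {
        String text = "&#1588;";
        System.out.println(defaultJson(text));   // "\u0026#1588;"
        System.out.println(unescapedJson(text)); // "&#1588;"
    }
}
```

Both outputs are valid JSON and parse back to the same string, so disabling the escaping only matters when the file is read by something that does not decode JSON escapes.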
