Char array to byte array in UTF-8

I have a small question. I need to encode a char array as UTF-8 and get the equivalent byte array in Java. Converting the char array to a String and then getting the byte array is not an option: String must be avoided for security reasons. If I use

byte[] encoded = Charset.forName("UTF-8").encode(CharBuffer.wrap(toBeEncoded)).array();

then, when the input array is longer than 9 characters, the output array has an extra empty element at the end. The longer the input, the more empty elements there are. When I then decode it, I get even more extra elements: if there is 1 empty element after encoding, there are two after decoding. This is not acceptable either, because I want to encrypt the encoded value. Thank you.


The problem is that Charset.encode() makes no guarantees about the capacity of the buffer it returns. It may well allocate extra space at the end, and because array() returns the entire backing array regardless of the buffer's limit, that unused space is what you are seeing. The buffer's limit, however, will be set correctly. In fact, there is no guarantee that the returned buffer will be backed by an array at all (it could be made a direct buffer in future Java versions, who knows?).
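
To see this for yourself, a minimal sketch like the following prints the bounds of the returned buffer. The class name BufferBoundsDemo, the method bounds and the sample input are made up for illustration, and the capacity you get may differ between Java versions, since it is not specified:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of any library.
final class BufferBoundsDemo {

    // Returns { position, limit, capacity } of the buffer produced by encode().
    static int[] bounds(char[] chars) {
        ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        return new int[] { buf.position(), buf.limit(), buf.capacity() };
    }

    public static void main(String[] args) {
        // 13 ASCII characters, so exactly 13 bytes of UTF-8 content.
        char[] toBeEncoded = "0123456789abc".toCharArray();
        int[] b = bounds(toBeEncoded);
        System.out.println("position = " + b[0]);
        System.out.println("limit    = " + b[1]);
        // The capacity is whatever the encoder chose to allocate;
        // it can be larger than the limit.
        System.out.println("capacity = " + b[2]);
    }
}
```

Whenever the capacity is larger than the limit, the bytes between the two are the "empty" elements that array() exposes.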

To get a properly sized array you'll need to make a properly sized byte array and copy only the data you want from the byte buffer into that array. Here we use the limit (which is the amount of content actually written into the buffer) to size the new array:

ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(toBeEncoded));
byte[] array = new byte[buf.limit()];
buf.get(array);
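
Since the question is about avoiding String for security reasons, here is a sketch of the same approach wrapped in a helper, assuming you also want to clear the temporary buffer once the bytes have been copied out. The names CharArrayEncoder and toUtf8Bytes are hypothetical, and the wipe is only possible when the buffer exposes its backing array:

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, not part of any library.
final class CharArrayEncoder {

    // Encodes chars as UTF-8 into an exactly sized byte array without creating a String.
    static byte[] toUtf8Bytes(char[] chars) {
        ByteBuffer buf = StandardCharsets.UTF_8.encode(CharBuffer.wrap(chars));
        // remaining() is limit - position; position is 0 here, so this equals limit().
        byte[] result = new byte[buf.remaining()];
        buf.get(result);
        // Assumption: we want to wipe the intermediate copy of the data.
        // hasArray() is false for direct or read-only buffers, in which case we skip this.
        if (buf.hasArray()) {
            Arrays.fill(buf.array(), (byte) 0);
        }
        return result;
    }

    public static void main(String[] args) {
        char[] secret = { 's', 'e', 'c', 'r', 'e', 't', '-', '1', '2', '3', '4', '5' };
        byte[] encoded = toUtf8Bytes(secret);
        System.out.println("chars: " + secret.length + ", bytes: " + encoded.length);
        // The caller is responsible for clearing its own arrays when done.
        Arrays.fill(secret, '\0');
        Arrays.fill(encoded, (byte) 0);
    }
}
```

The returned array has no trailing padding, so it can be passed to the encryption step as is.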

This article describes the limit, capacity and position of buffers nicely.
