How to convert from character positions to byte positions in UTF-8
I have a UTF-8 encoded text file. I can read it char by char. Each char can be either a single byte or multiple bytes. How can I tell when one byte was read and when more than one byte was read?
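Count the bytes while reading the chars. For each char c (taken as its Unicode code point value; if your language's char is a UTF-16 code unit, as in Java or C#, combine surrogate pairs into one code point first):
if (c < 128)
    bytesCount++;
else if (c < 2048)
    bytesCount += 2;
else if (c < 65536)
    bytesCount += 3;
else
    bytesCount += 4;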
See also the definition of the encoding in the Wikipedia article on UTF-8.
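As a rough illustration of the counting approach, here is a minimal Python sketch. The function names `utf8_len` and `char_to_byte_offsets` are made up for this example and do not come from any library. It assumes the text is already decoded into a string of Unicode code points, and it builds a table that maps each character position to its byte position in the UTF-8 encoding.

```python
def utf8_len(code_point):
    """Return how many bytes UTF-8 uses to encode one Unicode code point."""
    if code_point < 0x80:       # 0..127: plain ASCII, 1 byte
        return 1
    if code_point < 0x800:      # up to 2047: 2 bytes
        return 2
    if code_point < 0x10000:    # up to 65535: 3 bytes
        return 3
    return 4                    # everything above: 4 bytes


def char_to_byte_offsets(text):
    """Map character positions to byte positions (hypothetical helper).

    offsets[i] is the byte position where character i starts;
    the last entry is the total encoded length in bytes.
    """
    offsets = [0]
    for ch in text:
        offsets.append(offsets[-1] + utf8_len(ord(ch)))
    return offsets


if __name__ == "__main__":
    sample = "a\u00e9\u20ac\U0001F600"  # 1-, 2-, 3- and 4-byte characters
    print(char_to_byte_offsets(sample))
```

The difference between two neighbouring entries tells you whether that character took one byte or more. As a sanity check, the last entry should equal `len(text.encode("utf-8"))`.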