What's a portable way of converting Byte
I am trying to write server that will communicate with any standard client that can make socket connections (eg telnet client)
It started out as an echo server, which of course did not need to worry about network byte ordering.
I am familiar with ntohs, ntohl, htons, htonl functions. These would be great by themselves if I were transfering either 16 or 32-bit ints, or if the characters in the string being sent were multiples of 2 or 4 bytes.
I'd like create a function that operates on strings such as:
str_ntoh(char* net_str, char* host_str, int len)
{
uint32_t* netp, hostp;
netp = (uint32_t*)&net_str;
for(i=0; i < len/4; i++){
hostp[i] = ntoh(netp[i]);
}
}
Or something similar. The above thing assumes that the wordsize is 32-bits. We can't be sure that the wordsize on the sending machine is not 16-bits, or 64-bits right?
For client programs, such as telnet, they must be using hton* before they send and ntoh* after they receive data, correct?
EDIT: For the people that thing because 1-char is a byte that endian-ness doesn't matter:
int main(void)
{
uint32_t a = 0x01020304;
char* c = (char*)&a;
printf("%x %x %x %xn", c[0], c[1], c[2], c[3]);
}
Run this snippet of code. The output for me is as follows:
$ ./a.out
4 3 2 1
Those on powerPC chipsets should get '1 2 3 4' but those of us on intel chipset should see what I got above for the most part.
Maybe I'm missing something here, but are you sending strings, that is, sequences of characters? Then you don't need to worry about byte order. That is only for the bit pattern in integers. The characters in a string are always in the "right" order.
EDIT:
Derrick, to address your code example, I've run the following (slightly expanded) version of your program on an Intel i7 (little-endian) and on an old Sun Sparc (big-endian)
#include <stdio.h>
#include <stdint.h>
int main(void)
{
uint32_t a = 0x01020304;
char* c = (char*)&a;
char d[] = { 1, 2, 3, 4 };
printf("The integer: %x %x %x %xn", c[0], c[1], c[2], c[3]);
printf("The string: %x %x %x %xn", d[0], d[1], d[2], d[3]);
return 0;
}
As you can see, I've added a real char array to your print-out of an integer.
The output from the little-endian Intel i7:
The integer: 4 3 2 1
The string: 1 2 3 4
And the output from the big-endian Sun:
The integer: 1 2 3 4
The string: 1 2 3 4
Your multi-byte integer is indeed stored in different byte order on the two machines, but the characters in the char array have the same order.
With your function signature as posted you don't have to worry about byte order. It accepts a char*, that can only handle 8-bit characters. With one byte per character, you cannot have a byte order problem.
You'd only run into a byte order problem if you send Unicode, either in UTF16 or UTF32 encoding. And the endian-ness of the sending machine doesn't match the one of the receiving machine. The simple solution for that is to use UTF8 encoding. Which is what most text is sent as across networks. Being byte oriented, it doesn't have a byte order issue either. Or you could send a BOM.
If you'd like to send them as an 8-bit encoding (the fact that you're using char
implies this is what you want), there's no need to byte swap. However, for the unrelated issue of non-ASCII characters, so that the same character > 127
appears the same on both ends of the connection, I would suggest that you send the data in something like UTF-8, which can represent all unicode characters and can be safely treated as ASCII strings. The way to get UTF-8 text based on the default encoding varies by the platform and set of libraries you're using.
If you're sending 16-bit or 32-bit encoding... You can include one character with the byte order mark which the other end can use to determine the endianness of the character. Or, you can assume network byte order and use htons()
or htonl()
as you suggest. But if you'd like to use char
, please see the previous paragraph. :-)
上一篇: 在C ++中读取由Java +结构对齐打开的套接字
下一篇: 什么是转换字节的便携方式