What's a portable way of converting Byte-Order of strings in C

僤鯓⒐⒋嵵緔 submitted on 2019-12-04 15:36:26

Maybe I'm missing something here, but are you sending strings, that is, sequences of characters? Then you don't need to worry about byte order. Byte order only matters for multi-byte values such as integers. The characters in a string of single-byte characters are always in the "right" order.

EDIT:

Derrick, to address your code example, I've run the following (slightly expanded) version of your program on an Intel i7 (little-endian) and on an old Sun SPARC (big-endian):

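#include <stdio.h>
#include <stdint.h>

int main(void)
{
    uint32_t a = 0x01020304;
    /* View the integer's storage one byte at a time, in memory order. */
    unsigned char *c = (unsigned char *)&a;
    /* A byte array keeps exactly the order it was written in. */
    unsigned char d[] = { 1, 2, 3, 4 };
    printf("The integer: %x %x %x %x\n", c[0], c[1], c[2], c[3]);
    printf("The string:  %x %x %x %x\n", d[0], d[1], d[2], d[3]);
    return 0;
}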

As you can see, I've added a real char array to your print-out of an integer.

The output from the little-endian Intel i7:

The integer: 4 3 2 1
The string:  1 2 3 4

And the output from the big-endian Sun:

The integer: 1 2 3 4
The string:  1 2 3 4

Your multi-byte integer is indeed stored in different byte order on the two machines, but the characters in the char array have the same order.

With your function signature as posted, you don't have to worry about byte order. It accepts a char*, which can only handle 8-bit characters. With one byte per character, you cannot have a byte order problem.

You'd only run into a byte order problem if you send Unicode in UTF-16 or UTF-32 encoding and the endianness of the sending machine doesn't match that of the receiving machine. The simple solution is to use UTF-8 encoding, which is what most text is sent as across networks. Being byte-oriented, it doesn't have a byte order issue either. Alternatively, you could send a BOM (byte order mark) so the receiver can tell which order was used.
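To make the BOM option concrete, here is a minimal sketch of how a receiver might inspect the first two bytes of a UTF-16 buffer and then read code units in the indicated order. The names `bom_kind`, `detect_utf16_bom` and `read_utf16_unit` are made up for illustration, and the sketch assumes UTF-16 only (a UTF-32 little-endian BOM also begins with FF FE and is not distinguished here).

```c
#include <stddef.h>
#include <stdint.h>

/* Byte order signalled by a UTF-16 byte order mark (U+FEFF). */
typedef enum { BOM_NONE, BOM_UTF16_BE, BOM_UTF16_LE } bom_kind;

/* Hypothetical helper: look at the first two bytes of a received buffer. */
static bom_kind detect_utf16_bom(const unsigned char *buf, size_t len)
{
    if (len >= 2) {
        if (buf[0] == 0xFE && buf[1] == 0xFF) return BOM_UTF16_BE;
        if (buf[0] == 0xFF && buf[1] == 0xFE) return BOM_UTF16_LE;
    }
    return BOM_NONE;
}

/* Hypothetical helper: assemble one 16-bit code unit from the two bytes
 * at p. Anything other than BOM_UTF16_LE is read as big-endian. */
static uint16_t read_utf16_unit(const unsigned char *p, bom_kind order)
{
    if (order == BOM_UTF16_LE)
        return (uint16_t)(p[0] | (p[1] << 8));
    return (uint16_t)((p[0] << 8) | p[1]);
}
```

Because the code unit is assembled with shifts from individual bytes, the result does not depend on the endianness of the machine running the receiver.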

If you'd like to send them as an 8-bit encoding (the fact that you're using char implies this is what you want), there's no need to byte swap. However, there is the unrelated issue of non-ASCII characters: to make sure a character above 127 appears the same on both ends of the connection, I would suggest that you send the data in something like UTF-8, which can represent all Unicode characters and can be passed through code that expects NUL-terminated, ASCII-compatible byte strings. How you get UTF-8 text from the default encoding varies by platform and by the set of libraries you're using.
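As an illustration of why UTF-8 is byte-oriented, here is a small sketch of an encoder for a single code point. The function name `utf8_encode` is hypothetical; in practice you would more likely rely on your platform's conversion library. Each output byte is computed from the code point's value, so the byte sequence is the same on any host.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: encode one Unicode code point as UTF-8.
 * out must have room for at least 4 bytes.
 * Returns the number of bytes written, or 0 for an invalid code point. */
static size_t utf8_encode(uint32_t cp, unsigned char *out)
{
    if (cp < 0x80) {                      /* ASCII: one byte, unchanged */
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {                     /* two bytes */
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp < 0x10000) {                   /* three bytes */
        if (cp >= 0xD800 && cp <= 0xDFFF)
            return 0;                     /* surrogates are not encodable */
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {                 /* four bytes */
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;                             /* beyond the Unicode range */
}
```

For example, U+00E9 (é) encodes to the two bytes C3 A9, and every byte of a multi-byte sequence has its high bit set, so none of them can be mistaken for an ASCII character or a terminating NUL.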

If you're sending a 16-bit or 32-bit encoding, you can start the text with a byte order mark, which the other end can use to determine the endianness of the data. Or you can assume network byte order and use htons() or htonl() on each code unit, as you suggest. But if you'd like to use char, please see the previous paragraph. :-)
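The network-byte-order option might look roughly like the sketch below for 16-bit code units. It assumes a POSIX system where htons() and ntohs() come from <arpa/inet.h> (on Windows they live in <winsock2.h>), and the names `utf16_hton` and `utf16_ntoh` are invented for this example.

```c
#include <arpa/inet.h>  /* htons(), ntohs(); POSIX assumption */
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: convert n 16-bit code units from host order
 * to network (big-endian) order before sending. */
static void utf16_hton(const uint16_t *host, uint16_t *net, size_t n)
{
    for (size_t i = 0; i < n; i++)
        net[i] = htons(host[i]);
}

/* Hypothetical helper: the reverse conversion on the receiving side. */
static void utf16_ntoh(const uint16_t *net, uint16_t *host, size_t n)
{
    for (size_t i = 0; i < n; i++)
        host[i] = ntohs(net[i]);
}
```

After utf16_hton(), the bytes in memory are in big-endian order on any host, so the buffer can be written to the socket as is. For 32-bit code units the same loops would use htonl() and ntohl().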

It seems to me that the function prototype doesn't match its behavior. You're passing in a char *, but you're then casting it to uint32_t *. And, looking more closely, you're casting the address of the pointer, rather than the contents, so I'm concerned that you'll get unexpected results. Perhaps the following would work better:

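#include <stddef.h>
#include <stdint.h>
#include <arpa/inet.h>   /* ntohl() */

/* Convert len 32-bit values from network to host byte order. */
void arr_ntoh(const uint32_t *netp, uint32_t *hostp, size_t len)
{
    for (size_t i = 0; i < len; i++)
        hostp[i] = ntohl(netp[i]);
}

I'm basing this on the assumption that what you've really got is an array of uint32_t and you want to run ntohl() on each element.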
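If the data really does arrive as raw bytes in a char buffer, casting that buffer to uint32_t* can run into alignment and aliasing problems on some platforms. A possible alternative, sketched here with the hypothetical names `load_be32` and `buf_ntoh32`, is to assemble each value from its four bytes with shifts, assuming the sender used network (big-endian) order:

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: read one big-endian (network order) 32-bit value
 * from four raw bytes. */
static uint32_t load_be32(const unsigned char *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

/* Hypothetical helper: decode count values from a received byte buffer
 * (4 * count bytes long) into host-order integers. */
static void buf_ntoh32(const unsigned char *buf, uint32_t *hostp, size_t count)
{
    for (size_t i = 0; i < count; i++)
        hostp[i] = load_be32(buf + 4 * i);
}
```

This version needs no platform headers and gives the same result on little-endian and big-endian hosts, at the cost of doing the conversion by hand rather than through ntohl().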

I hope this is helpful.
