Preferred idiom for endianness-agnostic reads

[亡魂溺海] submitted on 2019-11-29 18:37:34

Question


In the Plan 9 source code I often find code like this, which reads serialised data from a buffer with a well-defined endianness:

#include <stdint.h>

uint32_t le32read(uint8_t buf[static 4]) {
    return (buf[0] | buf[1] << 8 | buf[2] << 16 | buf[3] << 24);
}

I expected both gcc and clang to compile this code into something as simple as this assembly on amd64:

    .global le32read
    .type le32read,@function
le32read:
    mov (%rdi),%eax
    ret
    .size le32read,.-le32read

But contrary to my expectations, neither gcc nor clang recognizes this pattern; both produce more complex assembly with multiple shifts instead.

Is there an idiom for this kind of operation that is both portable to all C99 implementations and produces good code (i.e. like the assembly presented above) across implementations?


Answer 1:


After some research (with the help of the terrific people in ##c on Freenode), I found that gcc 5.0 will implement optimizations for the kind of pattern described above. In fact, it compiles the C source listed in my question to exactly the assembly I listed there.

I haven't found similar information about clang, so I filed a bug report. As of Clang 9.0, clang recognises both the read and the write idiom and turns them into fast code.
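
The read and write idioms mentioned above can be sketched as follows. This is only an illustration, not the code from the bug report: the name `le32write` and the `uint32_t` casts are assumptions added here. The casts widen each byte before shifting, because `buf[3] << 24` on a promoted `int` can overflow when the top bit of `buf[3]` is set.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value, independent of host byte order.
 * Each byte is widened to uint32_t before it is shifted. */
uint32_t le32read(const uint8_t buf[static 4]) {
    return (uint32_t)buf[0]
         | (uint32_t)buf[1] << 8
         | (uint32_t)buf[2] << 16
         | (uint32_t)buf[3] << 24;
}

/* Write a 32-bit value in little-endian order.
 * The name le32write is illustrative, not taken from the original post. */
void le32write(uint8_t buf[static 4], uint32_t v) {
    buf[0] = (uint8_t)(v & 0xff);
    buf[1] = (uint8_t)((v >> 8) & 0xff);
    buf[2] = (uint8_t)((v >> 16) & 0xff);
    buf[3] = (uint8_t)((v >> 24) & 0xff);
}
```

Whether a particular compiler version folds these functions into a single load or store is best confirmed by inspecting its generated assembly (for example with `-O2 -S`).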




Answer 2:


If you want to guarantee a conversion between the native platform order and a defined order (network order, for example), you can let the system libraries do the work and simply use the functions declared in <netinet/in.h>: htons, htonl, ntohs and ntohl.

But I must admit that the include file is not guaranteed: under Windows I think it is winsock.h.
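
A minimal sketch of this approach on a POSIX system is shown below. The helper names `be32read` and `be32write` are hypothetical, and `<arpa/inet.h>` is assumed as the header that declares `ntohl` and `htonl`. Note that these functions convert to and from network order, which is big-endian, so on their own they do not cover the little-endian read from the question.

```c
#include <arpa/inet.h>  /* POSIX: htonl, ntohl */
#include <stdint.h>
#include <string.h>

/* Read a 32-bit big-endian (network order) value from a byte buffer.
 * be32read is a hypothetical helper name. */
uint32_t be32read(const uint8_t buf[static 4]) {
    uint32_t net;
    /* memcpy sidesteps alignment and aliasing concerns */
    memcpy(&net, buf, sizeof net);
    return ntohl(net);
}

/* Write a 32-bit value to a byte buffer in big-endian (network) order. */
void be32write(uint8_t buf[static 4], uint32_t v) {
    uint32_t net = htonl(v);
    memcpy(buf, &net, sizeof net);
}
```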




Answer 3:


You could determine endianness like in this answer. Then use the O32_HOST_ORDER macro to decide whether to read the byte array as a uint32_t directly or to use your bit-shifting expression.

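#include <stdint.h>
#include <string.h>

/* O32_HOST_ORDER and O32_LITTLE_ENDIAN come from the linked answer. */
uint32_t le32read(uint8_t buf[static 4]) {
    if (O32_HOST_ORDER == O32_LITTLE_ENDIAN) {
        uint32_t v;
        /* memcpy avoids the alignment and aliasing problems of *(uint32_t *)buf */
        memcpy(&v, buf, sizeof v);
        return v;
    }
    /* Widen each byte first so the shift by 24 cannot overflow int. */
    return ((uint32_t)buf[0] | (uint32_t)buf[1] << 8 |
            (uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24);
}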
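
The function above relies on `O32_HOST_ORDER` and `O32_LITTLE_ENDIAN`, which are not defined in this post. The following is one way such macros might be defined, using a union-based byte-order probe. The definitions are an assumption made for illustration, since the linked answer is not reproduced here.

```c
#include <stdint.h>

/* Hypothetical definitions. Each constant is the value a uint32_t takes
 * when its bytes in memory are 0, 1, 2, 3 under that byte order. */
enum {
    O32_LITTLE_ENDIAN = 0x03020100,
    O32_BIG_ENDIAN    = 0x00010203
};

/* Store the bytes 0, 1, 2, 3 and read them back as a 32-bit integer. */
static const union {
    unsigned char bytes[4];
    uint32_t value;
} o32_host_order = { { 0, 1, 2, 3 } };

#define O32_HOST_ORDER (o32_host_order.value)
```

Because `O32_HOST_ORDER` reads a constant object rather than being a preprocessor constant, the comparison is an ordinary runtime expression; an optimizing compiler can typically fold it away, but that is not guaranteed.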


Source: https://stackoverflow.com/questions/25219621/preferred-idiom-for-endianess-agnostic-reads
