Is it technically impossible to implement memcpy from scratch in Standard C?

独厮守ぢ 2021-02-19 17:23

Howard Chu writes:

In the latest C spec it is impossible to write a "legal" implementation of malloc or memcpy.

Is this right? My

2 Answers
  • 2021-02-19 17:42

    For the malloc function, section 6.5, paragraph 6 of the standard makes it clear that a conforming and portable implementation cannot be written in C:

    The effective type of an object for an access to its stored value is the declared type of the object, if any(87)...

    The (non normative) note 87 says:

    Allocated objects have no declared type.

    The only way to obtain an object with no declared type is... through the allocation function, which is required to return such an object! So inside the allocation function, you need something that the standard cannot allow in order to set up a memory region with no declared type.

    In common implementations, the standard library malloc and free are indeed written in C, but the implementation knows about it and simply assumes that the character array handed out inside malloc has no declared type. Full stop.

    But the rest of the same paragraph explains that there is no real problem in writing a memcpy implementation (emphasis mine):

    ... If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

    Provided you copy the object as an array of character type, which is a special access allowed per the strict aliasing rule, there is no problem in implementing memcpy, and your code is a possible and valid implementation.
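
    A minimal sketch of such a character-wise copy is shown below. The name my_memcpy and the helper my_memcpy_demo are made up for illustration; this is not the code from the question, only one way the idea could look, assuming a C99 or later compiler (for restrict):

```c
#include <stddef.h>
#include <string.h>

// Hypothetical byte-wise copy: every access goes through unsigned char,
// which is the "copied as an array of character type" case of 6.5/6.
void *my_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];            // one character-type access per byte

    return dst;                 // same return convention as memcpy
}

// Illustrative self-check: copy an array of double and compare the
// object representations. Returns 1 if source and destination match.
int my_memcpy_demo(void)
{
    const double src[3] = { 1.5, -2.0, 3.25 };
    double dst[3] = { 0 };

    my_memcpy(dst, src, sizeof src);
    return memcmp(dst, src, sizeof src) == 0;
}
```

    Note that the loop deliberately copies one byte at a time; widening it to larger chunks would no longer be an access through a character type.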

    IMHO Howard Chu's rant is about the good old memcpy usage below, which is no longer valid (assuming sizeof(float) == sizeof(int)):

    float f = 1.0f;
    int i;
    memcpy(&i, &f, sizeof(int));         // valid: copy at byte level, but the value of i is not specified by the standard
    printf("Repr of %f is %x\n", i, i);  // UB: i is an int and cannot be accessed (or passed to %f) as a float
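
    If the goal is only to look at the representation of f, one option that stays within the character-type access rule is to dump its bytes. The sketch below is an illustration, not part of the original argument; the function name float_repr_hex is hypothetical, and the byte order of the output depends on the platform:

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

// Hypothetical helper: writes the object representation of f as hex digits
// into out, in memory order. out must have room for
// 2 * sizeof(float) + 1 characters (two digits per byte plus the NUL).
void float_repr_hex(float f, char *out)
{
    unsigned char bytes[sizeof f];

    memcpy(bytes, &f, sizeof f);        // byte-level copy into a character array
    for (size_t k = 0; k < sizeof f; k++)
        sprintf(out + 2 * k, "%02x", (unsigned)bytes[k]);
}
```

    A caller would declare char buf[2 * sizeof(float) + 1], call float_repr_hex(1.0f, buf) and print buf with %s.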
    
  • 2021-02-19 17:47

    TL;DR
    It should be fine, as long as the memcpy is based on a naive character-by-character copy and is not optimized to move chunks the size of the largest aligned type that can be copied in a single instruction. The latter is how standard library implementations do it.


    What's concerning is something like this scenario:

    void* my_int = malloc(sizeof(int)); // sizeof *my_int would be invalid here: my_int is a void*
    int another_int = 1;
    
    my_memcpy(my_int, &another_int, sizeof(int));
    
    printf("%d", *(int*)my_int); // well-defined or strict aliasing violation?
    

    Explanation:

    • The data pointed at by my_int has no effective type.
    • When we copy the data into the my_int location, one might be concerned that we force the effective type to become unsigned char, since that's what my_memcpy uses.
    • And then, when we read that memory location through int*, would we violate strict aliasing?

    However, the key here is a special exception in the rule for effective type, specified in C17 6.5/6, emphasis mine:

    If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one.

    Since we do copy the array as character type, the effective type of what my_int points at will become that of the object another_int from which the value was copied.

    So everything should be fine.
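
    The scenario above can be put together as a small sketch. Both my_memcpy (written here as a plain byte loop) and the helper read_back are hypothetical names used only for illustration, and the code assumes C99 or later:

```c
#include <stddef.h>
#include <stdlib.h>

// Hypothetical byte-wise memcpy: copies through unsigned char only.
void *my_memcpy(void *restrict dst, const void *restrict src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    for (size_t i = 0; i < n; i++)
        d[i] = s[i];
    return dst;
}

// Copies value into freshly allocated storage and reads it back as int.
// Returns 0 on success and stores the result in *out; returns -1 if
// malloc fails.
int read_back(int value, int *out)
{
    void *my_int = malloc(sizeof(int));         // allocated: no declared type
    if (my_int == NULL)
        return -1;

    my_memcpy(my_int, &value, sizeof(int));     // copied as an array of character type,
                                                // so the effective type becomes int (6.5/6)
    *out = *(int *)my_int;                      // read through an int lvalue
    free(my_int);
    return 0;
}
```

    Calling read_back(1, &result) is expected to return 0 and leave 1 in result, which matches the conclusion that the read through int* is well-defined.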

    In addition, you restrict-qualified the parameters, so there should be no fuss about whether the two pointers might alias each other, just like with the real memcpy.

    Notably, this rule has remained the same through C99, C11 and C17. One might argue that it is a very bad rule abused by compiler vendors, but that's another story.
