What means of this code in C qsort?

问题

void qsort (void *a, size_t n, size_t es, int (*compare)(const void *, const void *)

where a is a start of array address, n is sizeof array, es is sizeof array element.

I read the source code of qsort in C that I can't understand. the code is as follows.

#define SWAPINT(a,es) swaptype = ((char*)a- (char*)0 % sizeof(long) || \
        es % sizeof(long) ? 2: es == sizeof(long)? 0 : 1

I interpret this macro by,

if(((char*)a- (char*)0)% sizeof(long))==1 || es % sizeof(long)==1)
     swaptype = 2;
else if(es== sizeof(long))
     swaptype = 0;
else
     swaptype = 1;

But I don't understand why type conversion is implemented, (char*)a.

And what means of this line?

(char*)a- (char*)0)% sizeof(long)==1

回答1:

Wherever you found that code, you probably copied it incorrectly. I found some very similar code in libutil from Canu:

c.swaptype = ((char *)a - (char *)0) % sizeof(long) || \
  es % sizeof(long) ? 2 : es == sizeof(long)? 0 : 1;

This code was likely illegitimally (because the terms of the copyright license are violated) copied from FreeBSD's libc:

//__FBSDID("$FreeBSD: src/lib/libc/stdlib/qsort.c,v 1.12 2002/09/10 02:04:49 wollman Exp $");

So I'm guessing you got it from a *BSD libc implementation. Indeedd FreeBSD's quicksort implementation contains the SWAPINIT macro (not SWAPINT):

#define SWAPINIT(TYPE, a, es) swaptype_ ## TYPE =       \
        ((char *)a - (char *)0) % sizeof(TYPE) ||       \
        es % sizeof(TYPE) ? 2 : es == sizeof(TYPE) ? 0 : 1;

After parsing, you should find that the above code is roughly the same as

condition_one   = ((char *)a - (char *)0) % sizeof(long);
condition_two   = es %  sizeof(long);
condition_three = es == sizeof(long);
c.swaptype = (condition_one || condition_two) ? 2 : condition_three ? 0 : 1;

Note that condition_two, as a condition, is not the same as es % sizeof(long) == 1, but rather es % sizeof(long) != 0. Aside from that, your translation was correct.

The intent of these conditions seems to be as follows:

condition_one is true when a is not long-aligned.
condition_two is true when es is not a multiple of long.
condition_three is true when es is exactly long.

As a result,

swaptype == 2 is when you don't have enough guarantees about the elements to be clever about swapping,
swaptype == 1 is intended for arrays with elements that are aligned along long boundaries (note: but not necessarily aligned as longs!), and
swaptype == 0 is intended for arrays that match the previous description, that also have elements that are also long-sized.

There is explicit type conversion in this case, because a has type void*, for which type arithmetic is undefined. However, also note that ((char *)a - (char *)0) is undefined too:

When two pointers are subtracted, both shall point to elements of the same array object, or one past the last element of the array object; the result is the difference of the subscripts of the two array elements.

(C11 draft N1570, section 6.5.6, clause 9 on pages 93 and 94.)

It's not exactly spelled out in C11, but the null pointer is not part of the same array as the object pointed to by a, so the basic rules for pointer arithmetic are violated, so the behaviour is undefined.

回答2:

The macros is trying to check for alignment portably in a language, C, which doesn't really allow for such a test. So we subtract the null pointer from our pointer to obtain an integer, then take modulus the size of a long. If the result is zero, the data is long-aligned and we can access as longs. If it is not, we can try some other scheme.

回答3:

As remarked in the comments, the macro definition you present does not expand to valid C code because it involves computing (char*)0 % sizeof(long), where the left-hand operand of the % has type char *. That is not an integer type, but both operands of % are required to have integer type.

Additionally, the macro's expansion has unbalanced parentheses. That's not inherently wrong, but it makes that macro tricky to use. Furthermore, even where operator precedence yields a sensible result, usage of parentheses and extra whitespace can aid human interpretation of the code, at no penalty to execution speed, and negligible extra compilation cost.

So, I think the desired macro would be more like this:

#define SWAPINT(a,es) swaptype = (                                  \
    ((((char*)a - (char*)0) % sizeof(long)) || (es % sizeof(long))) \
        ? 2                                                         \
        : ((es == sizeof(long)) ? 0 : 1))                           \
)

I'd consider instead writing the penultimate line as

        : (es != sizeof(long))

to reduce the complexity of the expression at a slight cost to its comprehensibility. In any event, the intent appears to be to set swaptype to:

2 if a is not aligned on an n-byte boundary, where n is the number of bytes in a long, or if es is not an integer multiple of the size of a long; otherwise
1 if es is unequal to the size of a long; otherwise
0

That's similar, but not identical, to your interpretation. Note, however, that even this code has undefined behavior because of (char*)a - (char*)0. Evaluating that difference has defined behavior only if both pointers point into, or just past the end of, the same object, and (char *)0 does not point (in)to or just past the end of any object.

You asked specifically:

But I don't understand why type conversion is implemented, (char*)a.

That is performed because pointer arithmetic is defined in terms of the pointed-to type, so (1), a conforming program cannot perform arithmetic with a void *, and (2) the code wants the result of the subtraction to be in the same units as the result of the sizeof operator (bytes).

And what means of this line?
(char*)a- (char*)0)% sizeof(long)==1

That line does not appear in the macro you presented, and it is not a complete expression because of unbalanced parentheses. It appears to be trying to determine whether a points one past an n-byte boundary, where n is as defined above, but again, evaluating the pointer difference has undefined behavior. Note also that for an integer x, x % sizeof(long) == 1 evaluated in boolean context has different meaning than x % sizeof(long) evaluated in the same context. The latter makes more sense in the context you described.

来源：https://stackoverflow.com/questions/39835354/what-means-of-this-code-in-c-qsort

标签

qsort