An interesting C linked list idiom

I was at an interview for a C position in which they presented me with an idiom that I haven't previously encountered. This is a trick that simplifies implementation of various algorithms involving linked lists and I'm wondering if anybody else has encountered this.

Say we have a linked list record defined so:

typedef struct _record
{
    char* value;
    struct _record* next;
} record;

We need a function that inserts a new record so that the entire list remains sorted with respect to the value's in the records. The following implementation is simpler than anything I would have used, albeit less readable.

void insert_sorted(record** r, const char* value)
{
    record* newrec = NULL;
    while(*r && strcmp(value, (*r)->value) > 0)
        r = &((*r)->next); /* move r to point to the next field of the record */
    newrec = malloc(sizeof(record));
    newrec->value = strdup(value);
    newrec->next = *r;
    *r = newrec;
}

When the function is called, r points to the head pointer of the list. During the while loop, r is updated to point to the next field of the record that comes just before the point where we want to put the new record in. The last line of the function either updates the head pointer of the list (if the insertion happens at the beginning) or the next field of the previous record, which is quite cool.

A couple of questions:

Does this idiom have a name or is it mentioned in any literature?
Are there others like it in the C language?

I thought I knew C pretty well and had pointers and indirection pretty well figured out, but this one took me a while to fully understand.

I've used similar to this to insert into a binary tree. Because when iterating the tree, you usually stop when your pointer becomes NULL (you ran off the tree).

So to insert, you have 3 options,

1: use a variable which tracks the previous value of your iterating pointer.

2: stop when the pointer you would follow is NULL before you follow it, works but slightly less elegant in my opinion.

3: or a more elegant solution is simply use a pointer to a pointer, so you can just do: *it = new_node(); and it'll add it where the NULL used to be in your tree.

For a linked list, while this code works nicely, I usually just use a doubly linked list which makes it trivial to insert at any location.

I'd say the idiom is "the kind of code which gave 'c' a bad name"

Unwarrantedly clever
Unwarrantedly compact
Surprising side effects on caller
No error handling on malloc
Only works for US English strings

I don't see anything I'd call an idiom per se. It looks like standard coding for when you deal with datastructures in C.

My only complaint would be that the callers pointer (*r) is modified. Depending on the use of the function, I'd expect thats an unexpected side effect. Besides removing the unexpected side effect, using a local variable to play the role of *r would make the code more readable.

What would be the idiom here? Surely not the implementation of a linked list. The usage of a pointer to pointer construct? The compact loop?

Personally I'd use a pointer return value instead of operating on an input value. Because seeing this input data type would ring a bell, which makes me copy my value before handing it to your function.

This is a well known thing - double pointer iteration (that's my name for it, I don't know the official name). The goal is to be able to locate a position in single linked list, and then insert before that position (inserting after it is trivial). Implemented naively, this requires two pointers (current and prev) and special code for beginning of the list (when prev == NULL).

The code the way I usually write it is something along the lines of

void insertIntoSorted(Element *&head, Element *newOne)
{
  Element **pp = &head;
  Element *curr;
  while ((curr = *pp) != NULL && less(curr, newOne)) {
    pp = &(pp->next);
  }
  newOne->next = *pp;
  *pp = newOne;
}

Update:

I've forgot theother purpose for this trick - a more important one. It's used to delete elements from single linked lists:

// returns deleted element or NULL when key not found
Element *deleteFromList(Element *&head, const ElementKey &key)
{
  Element **pp = &head;
  Element *curr;
  while ((curr = *pp) != NULL && !keyMatches(curr, key)) {
    pp = &(pp->next);
  }
  if (curr == NULL) return NULL;
  *pp = (*pp)->next; // here is the actual delete
  return curr;
}

I don't know if this has a name or even if it's some special idiom but, since memory is relatively plentiful nowadays, my linked lists (where the language libraries don't make them available) are a special variant which simplifies the code greatly.

For a start, they're always doubly-linked since this makes traversal in both directions easier, for both processing and insert/delete operations.

An 'empty' list actually consists of two nodes, the head and the tail. By doing this, inserts and deletes do not need to worry about whether the node they're deleting is the head or tail, they can just assume it's a middle node.

Insertion of a new node y before node x then become a simple:

x -> prev -> next = y
y -> next = x
y -> prev = x -> prev
x -> prev = y

Deletion of node x is a simple:

x -> prev -> next = x -> next
x -> next -> prev = x -> prev
free x

Traversal is adjusted to ignore the extraneous head and tail:

n = head -> next
while n != tail
    process n
    n = n -> next

This all serves to make the code a lot easier to understand without all the special handling of the edge cases, at the cost of a couple of nodes of memory.

Instead of returning the value of the new node as an in/out parameter, you are better off having that be the return value of the function. This simplifies both the calling code, and the code inside the function (you can get rid of all those ugly double indirections).

record* insert_sorted(const record* head, const char* value)

You are missing error handling for the malloc/strdup failing case btw.

This idiom is given in the Chapter 12 of "Pointers on C".This is used to insert a node into a linked list without list head.

To answer the original question, this is known as a pointer centric approach instead of the naive node centric approach. Chapter 3 of "Advanced Programming Techniques" by Rex Barzee available at amazon.com includes a much better example implementation of the pointer centric approach.

I have also come up with this use of a double pointer, I have used it, but I don't really like it. The code that I came up with has this kernel to search for certain objects and remove them from the list:

Element** previous = &firstElement, *current;
while((current = *previous)) {
    if(shouldRemove(current)) {
        *previous = current->next;  //delete
    } else {
        previous = &current->next;  //point to next
    }
}

The reason I don't like my code is the subtle difference between the two if clauses: the syntax is almost identical, but the effect is entirely different. I do not think, we should write code as subtle as this, but doing it differently makes the code really lengthy. So, either way is bad - you might go for brevity or for readability, it's your choice.

despite of the tricks, isn't the role of variable "r" changed? how does the caller tell what "*r" is for after calling? or after execution, what is the header of the list?

I could not believe this can be exemplified (even in some book?!). Did I miss anything?

If you do not return any pointer (like the others suggested), then I would suggest following changes to keep the role of the input.

void insert_sorted(record** head, const char* value)
{
    record** r = head;
    bool isSameHead = false;
    record* newrec = NULL;
    while(*r && strcmp(value, (*r)->value) > 0) {
        r = &((*r)->next); isSameHead = true; }
    newrec = malloc(sizeof(record));
    newrec->value = strdup(value);
    newrec->next = *r;
    *r = newrec;
    if (!isSameHead) *head = newrec;
}

actually, probably another better way to do it is using the "dummy head node", which links its next to the beginning of the list.

    void insert_sorted(record** head, const char* value)
    {
        record dummyHead;
        dummyHead.next = *head;
        record* r = &dummyHead;
        while(r->next) {
           if(strcmp(value, r->next->value) < 0) 
              break;
           r = r->next;}
        newrec = malloc(sizeof(record));
        newrec->value = strdup(value);
        newrec->next = r->next;
        r->next = newrec;
        *head = dummyHead.next;
    }

来源：https://stackoverflow.com/questions/332441/an-interesting-c-linked-list-idiom

标签

idioms