Sorting a linked list containing strings

Submitted by 青春壹個敷衍的年華 on 2020-05-30 13:11:49

Question


What I want to do is sort a linked list containing only strings. To do so, I have 2 options.

Option 1 - dynamically allocate an array with the same size as the linked list and the strings containing it also with the same size, copy the contents of the linked list into the array and sort it using qsort.

Option 2 - implement a merge sort algorithm in order to sort it.

My first question: will option 2 cost more memory and time than option 1, or is it the better option?

My second problem is that I'm trying to do option 1, and for that I have a header file containing the linked-list code. After allocating memory for the array of strings, I get a segmentation fault when I try to copy the contents.

Program:

#include <stdlib.h> 
#include <stdio.h>
#include <string.h>
#include "Listas_ligadas_char.h"

int main() {
    link_char head = NULL;
    char **strings;
    head = insertEnd_char(head, "fcb");
    head = insertEnd_char(head, "bvb");
    head = insertEnd_char(head, "slb");
    head = insertEnd_char(head, "fcp");
    int len = length_char(head);
    int i = 0, j;
    strings = (char **)malloc(sizeof(char *) * len);
    link_char t;
    t = head;
    while (t != NULL && i <= len) {
        strings[i] = (char *)malloc(sizeof(char) * (strlen(t->str) + 1));
        strcpy(strings[i++], t->v.str)
        t = t->next;
    }
    for (t = head; t != NULL; t = t->next) {
        printf("* %s\n", strings[i]);
    }
}

Header file:

#ifndef _Listas_ligadas_char_
#define _Listas_ligadas_char_

#include <stdlib.h> 
#include <stdio.h>
#include <string.h>

typedef struct node_char {
    char *str;
    struct node_char *next;
} *link_char;

link_char lookup_str(link_char head, char *str) {
    link_char t;
    for (t = head; t != NULL; t = t->next)
        if (strcmp(t->str, str) == 0)
            return t;
    return NULL;
}

link_char NEW_str(char *str) {
    int i;
    link_char x = (link_char)malloc(sizeof(struct node_char));
    x->str = (char *)malloc(sizeof(char) * (strlen(str) + 1));
    strcpy(x->str, str);
    x->next = NULL;
    return x;
}

link_char insertEnd_char(link_char head, char *str) {
    link_char x;
    if (head == NULL)
        return NEW_str(str);
    for (x = head; x->next != NULL; x = x->next)
        ;
    x->next = NEW_str(str);
    return head;
}

int length_char(link_char head) {
    int count = 0;
    link_char x;
    for (x = head; x != NULL; x = x->next)
        count++;
    return count;
}

void print_lista_char(link_char head, int NL) {
    link_char t;
    for (t = head; t != NULL; t = t->next) {
        printf("%d * %s\n", NL, t->str);
    }
}

void FREEnode_str(link_char t) {
    free(t->str);
    free(t);
}

link_char delete_el_char(link_char head, char *str) {
    link_char t, prev;
    for (t = head, prev = NULL; t != NULL;
        prev = t, t = t->next) {
        if (strcmp(t->str, str) == 0) {
            if (t == head)
                head = t->next;
            else
                prev->next = t->next;
            FREEnode_str(t);
            break;
        }
    }
    return head;
}
#endif

By the way, in case you are wondering what NL is: it is a variable that counts the current line of stdin. All I want is to print the array; I don't need to keep its elements.

So if you can tell me which option you think is best, I would appreciate it a lot.


Answer 1:


Option 1 - dynamically allocate an array with the same size as the linked list and the strings containing it also with the same size, copy the contents of the linked list into the array and sort it using qsort.

It is not necessary to convert the linked list to an array. The quicksort algorithm can also be applied to linked lists.

However, since your linked list is only singly-linked, you cannot use the (generally more efficient) Hoare partition scheme, but must use the Lomuto partition scheme instead. This is because the Hoare partition scheme requires the ability to traverse the linked list backwards (which requires a doubly-linked list).

Although converting the linked list to an array is not necessary for the quicksort algorithm, it may still be worthwhile, because a linked list has worse spatial locality than an array. Either way, the average time complexity of the algorithm will be O(n*log n) and the worst-case time complexity will be O(n^2).

But since your nodes only contain pointers to strings, you will have bad spatial locality anyway when dereferencing these pointers. So in this case, converting the linked list to an array may not help much, because it would only improve the spatial locality of the pointers to the strings, not of the strings themselves.
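As a rough illustration of quicksort working directly on the singly-linked list, here is a minimal sketch of a Lomuto-style partition adapted to forward-only traversal. It assumes the node layout from the question's header, uses the first node of each range as the pivot, and swaps the str pointers instead of relinking nodes. The names list_quicksort and swap_str are hypothetical, not part of the question's code.

```c
#include <assert.h>
#include <string.h>

/* Assumed to mirror the node layout from the question's header. */
struct node_char {
    char *str;
    struct node_char *next;
};

/* Exchange the string pointers of two nodes; the nodes stay linked as-is. */
static void swap_str(struct node_char *a, struct node_char *b) {
    assert(a != NULL && b != NULL);
    char *tmp = a->str;
    a->str = b->str;
    b->str = tmp;
}

/* Sort the half-open range [first, end) in place; end may be NULL. */
static void list_quicksort(struct node_char *first, struct node_char *end) {
    if (first == end || first->next == end)
        return;                       /* 0 or 1 node: already sorted */

    const char *pivot = first->str;   /* first node's string is the pivot */
    struct node_char *last_small = first;

    /* Forward-only pass: nodes in (first, last_small] hold strings < pivot. */
    for (struct node_char *j = first->next; j != end; j = j->next) {
        if (strcmp(j->str, pivot) < 0) {
            last_small = last_small->next;
            swap_str(last_small, j);
        }
    }

    /* Move the pivot between the two partitions. */
    swap_str(first, last_small);

    list_quicksort(first, last_small);
    list_quicksort(last_small->next, end);
}
```

Calling list_quicksort(head, NULL) sorts the whole list. Because this sketch always picks the first node as the pivot and recurses on both sides, an already sorted list triggers the O(n^2) worst case and a recursion depth of O(n).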


My first question: will option 2 cost more memory and time than option 1, or is it the better option?

Merge-sort is ideal for linked lists.

Another advantage of merge-sort is its worst-case time complexity, which is O(n*log n), whereas it is O(n^2) with quicksort.

A bottom-up merge-sort needs only O(1) extra space on a linked list (a recursive top-down version uses O(log n) stack space), whereas quicksort needs O(log n) stack space when implemented carefully. However, if you decide to convert the list to an array for quicksort, the space complexity of your algorithm will increase to O(n).
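To make the constant-space claim concrete, here is a minimal sketch of an iterative bottom-up merge-sort that relinks the existing nodes and uses neither recursion nor extra allocation. It assumes the node layout from the question's header; the function name merge_sort_bottom_up is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed to mirror the node layout from the question's header. */
struct node_char {
    char *str;
    struct node_char *next;
};

/* Repeatedly merge adjacent runs of length 1, 2, 4, ... until one run is left. */
static struct node_char *merge_sort_bottom_up(struct node_char *head) {
    if (head == NULL)
        return NULL;

    for (size_t width = 1; ; width *= 2) {
        struct node_char *p = head;     /* start of the left run */
        struct node_char *tail = NULL;  /* last node of the rebuilt list */
        size_t merges = 0;
        head = NULL;

        while (p != NULL) {
            merges++;

            /* Step q up to `width` nodes past p to find the right run. */
            struct node_char *q = p;
            size_t psize = 0;
            for (size_t i = 0; i < width && q != NULL; i++) {
                psize++;
                q = q->next;
            }
            size_t qsize = width;

            /* Merge the two runs, appending the smaller front node each time. */
            while (psize > 0 || (qsize > 0 && q != NULL)) {
                struct node_char *e;
                if (psize == 0) {
                    e = q; q = q->next; qsize--;
                } else if (qsize == 0 || q == NULL) {
                    e = p; p = p->next; psize--;
                } else if (strcmp(p->str, q->str) <= 0) {
                    e = p; p = p->next; psize--;   /* <= keeps the sort stable */
                } else {
                    e = q; q = q->next; qsize--;
                }
                if (tail != NULL)
                    tail->next = e;
                else
                    head = e;
                tail = e;
            }
            p = q;  /* q now points at the start of the next pair of runs */
        }

        assert(tail != NULL);  /* the list is non-empty, so a node was appended */
        tail->next = NULL;
        if (merges <= 1)
            return head;       /* a single merge covered the whole list */
    }
}
```

Usage would be head = merge_sort_bottom_up(head); the returned pointer is the new first node, since the old head may have moved.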


My second problem is that I'm trying to do option 1, and for that I have a header file containing the linked-list code. After allocating memory for the array of strings, I get a segmentation fault when I try to copy the contents.

I can only help you if you provide a minimal reproducible example of your problem. The code you posted does not reproduce the problem. It does not even compile. The following line contains several errors:

strcpy(strings[i++],t->v.str)




Answer 2:


You indeed have 2 sensible options:

  • option 1 will usually provide the best performance but requires additional space of sizeof(link_char) * N.

  • option 2 requires only O(log(N)) space: an array of pending sublists for bottom-up mergesort, or the equivalent stack depth for recursive top-down mergesort. The drawback is that you have to write the sorting function yourself, and it is easy to make mistakes.
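For option 2, a recursive top-down mergesort might look like the following minimal sketch. It assumes the node layout from the question's header and relinks the existing nodes, so the only extra space is the O(log(N)) recursion depth. The names merge_sorted and merge_sort_top_down are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumed to mirror the node layout from the question's header. */
struct node_char {
    char *str;
    struct node_char *next;
};

/* Merge two sorted lists into one sorted list by relinking nodes. */
static struct node_char *merge_sorted(struct node_char *a, struct node_char *b) {
    struct node_char dummy;           /* placeholder head, never returned */
    struct node_char *tail = &dummy;
    dummy.next = NULL;

    while (a != NULL && b != NULL) {
        if (strcmp(a->str, b->str) <= 0) {   /* <= keeps the sort stable */
            tail->next = a;
            a = a->next;
        } else {
            tail->next = b;
            b = b->next;
        }
        tail = tail->next;
    }
    tail->next = (a != NULL) ? a : b;        /* append the leftover run */
    return dummy.next;
}

/* Split the list in half, sort each half recursively, then merge. */
static struct node_char *merge_sort_top_down(struct node_char *head) {
    if (head == NULL || head->next == NULL)
        return head;                         /* 0 or 1 node: already sorted */

    /* slow advances one node per step, fast two, so slow stops at the middle. */
    struct node_char *slow = head;
    struct node_char *fast = head->next;
    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;
        fast = fast->next->next;
    }

    struct node_char *second = slow->next;
    assert(second != NULL);                  /* holds for any list of 2+ nodes */
    slow->next = NULL;                       /* cut the list into two halves */

    return merge_sorted(merge_sort_top_down(head), merge_sort_top_down(second));
}
```

As with any sort that relinks nodes, the caller should store the result: head = merge_sort_top_down(head);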

Note that for option 1, you should not make a copy of the strings, but just allocate an array of pointers and initialize it to point to the nodes themselves. This way you can preserve the node structures that could contain other information and avoid extra allocations.

Note also that once you have the array of node pointers and a comparison function, you can use qsort or other sorting functions such as timsort or mergesort which may be more appropriate in terms of worst case time complexity.

There are multiple problems in your implementation:

  • the loop test while (t != NULL && i <= len) is incorrect. The two tests should be redundant, but if you insist on testing i, it should be i < len; otherwise you might write beyond the end of the strings array if length_char returned an incorrect count.
  • strcpy(strings[i++], t->v.str) has two errors: the node structure has no member v, and the statement is missing its terminating semicolon. You probably mean strcpy(strings[i++], t->str);
  • the printing loop has undefined behavior because you neither reset i to 0 nor increment it in the loop body, so every call to printf receives strings[i] with i equal to len, which reads beyond the end of the allocated array. You might get a crash, an invalid pointer, or by chance a null pointer that printf might ignore... It should be:

    for (i = 0; i < len; i++) {
        printf("* %s\n", strings[i]);
    }
    

Here is a modified version:

#include <stdio.h>
#include <stdlib.h>
#include "Listas_ligadas_char.h"

int cmp_char(const void *aa, const void *bb) {
    link_char a = *(const link_char *)aa;
    link_char b = *(const link_char *)bb;
    return strcmp(a->str, b->str);
}

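link_char sort_char(link_char head) {
    /* lists with fewer than 2 nodes are already sorted */
    if (head != NULL && head->next != NULL) {
        size_t i, len = length_char(head);
        link_char *array = malloc(sizeof(*array) * len);
        link_char t = head;
        if (array == NULL)
            return head;    /* allocation failed: leave the list unsorted */
        /* collect a pointer to every node */
        for (i = 0; i < len; i++, t = t->next)
            array[i] = t;
        qsort(array, len, sizeof(*array), cmp_char);
        /* relink the nodes in sorted order */
        head = t = array[0];
        for (i = 1; i < len; i++)
            t = t->next = array[i];
        t->next = NULL;
        free(array);
    }
    return head;
}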

int main() {
    link_char head = NULL;

    head = insertEnd_char(head, "fcb");
    head = insertEnd_char(head, "bvb");
    head = insertEnd_char(head, "slb");
    head = insertEnd_char(head, "fcp");

    head = sort_char(head);

    for (link_char t = head; t != NULL; t = t->next) {
        printf("* %s\n", t->str);
    }
    return 0;
}
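If the list itself does not need to be reordered and the goal is only to print the strings in sorted order, as the question states, a lighter variant is possible. The following is a minimal sketch that builds an array of pointers to the strings already owned by the list (no copies) and sorts that array with qsort. It assumes the node layout from the question's header; the names sorted_view and cmp_str_ptr are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed to mirror the node layout from the question's header. */
struct node_char {
    char *str;
    struct node_char *next;
};

/* qsort comparator: each element of the array is a `const char *`. */
static int cmp_str_ptr(const void *a, const void *b) {
    const char *sa = *(const char *const *)a;
    const char *sb = *(const char *const *)b;
    return strcmp(sa, sb);
}

/*
 * Return a malloc'ed array of pointers to the list's strings, sorted.
 * The strings are borrowed from the list, not copied, so the caller
 * frees only the array. Stores the element count in *count. Returns
 * NULL with *count == 0 for an empty list or on allocation failure.
 */
static const char **sorted_view(const struct node_char *head, size_t *count) {
    assert(count != NULL);
    *count = 0;

    size_t len = 0;
    for (const struct node_char *t = head; t != NULL; t = t->next)
        len++;
    if (len == 0)
        return NULL;

    const char **view = malloc(len * sizeof *view);
    if (view == NULL)
        return NULL;

    size_t i = 0;
    for (const struct node_char *t = head; t != NULL; t = t->next)
        view[i++] = t->str;

    qsort(view, len, sizeof *view, cmp_str_ptr);
    *count = len;
    return view;
}
```

A caller would loop with for (i = 0; i < count; i++) printf("* %s\n", view[i]); and then call free(view). The list keeps its original order, and the view becomes invalid once the list nodes are freed.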

Notes:

  • it is error prone to hide pointers behind typedefs. You should define node_char as typedef struct node_char node_char and use node_char * everywhere.
  • it is unconventional to define the list functions in the header file. You might do this for static inline functions, but global functions should only be declared in the header file and defined in a single source file; otherwise you will get duplicate definition errors if multiple modules include this header file and get linked together.
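The two notes above can be sketched as follows. The example is shown as a single listing so that it compiles on its own; the comments mark which part would go into a header and which into one source file. The file names list_char.h and list_char.c and the function names list_append, list_length and list_free are hypothetical, and list_append reports allocation failure through its return value, which differs from the question's insertEnd_char.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* ---- would live in a header, e.g. a hypothetical list_char.h ---- */

typedef struct node_char node_char;   /* names the struct, hides no pointer */

struct node_char {
    char *str;
    node_char *next;
};

/* Declarations only: safe to include from any number of source files. */
int list_append(node_char **head, const char *str);
size_t list_length(const node_char *head);
void list_free(node_char *head);

/* ---- would live in one source file, e.g. a hypothetical list_char.c ---- */

/* Append a copy of str at the end of the list; returns 0, or -1 on failure. */
int list_append(node_char **head, const char *str) {
    assert(head != NULL && str != NULL);

    node_char *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->str = malloc(strlen(str) + 1);
    if (n->str == NULL) {
        free(n);
        return -1;
    }
    strcpy(n->str, str);
    n->next = NULL;

    /* Walk to the terminating NULL link and attach the new node there. */
    while (*head != NULL)
        head = &(*head)->next;
    *head = n;
    return 0;
}

/* Count the nodes in the list. */
size_t list_length(const node_char *head) {
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Free every node together with the string it owns. */
void list_free(node_char *head) {
    while (head != NULL) {
        node_char *next = head->next;
        free(head->str);
        free(head);
        head = next;
    }
}
```

With this layout every use of a pointer is visible as node_char * in the code, and each function has exactly one definition no matter how many files include the header.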


Source: https://stackoverflow.com/questions/61963925/sorting-a-linked-list-containing-strings
