How to sort an array of strings alphabetically (case sensitive, nonstandard collation)

北荒 2020-12-02 19:15

I need C code to sort some strings. It should be case sensitive, and for the same letter in upper and lower case, the lower-case one must come first.

6 answers
  •  庸人自扰
    2020-12-02 19:43

    The key to the OP's code is its use of the function strcmp() to compare two strings.
    So I will start by replacing this standard function with another one, like the following:

      // We assume that the collating sequence satisfies the following rules:
      // 'A' < 'B' < 'C' < ...
      // 'a' < 'b' < 'c' < ...
      // We don't make any other assumptions.
    
      #include <ctype.h>
      int my_strcmp(const char * s1, const char * s2)
      {
          const char *p1 = s1, *p2 = s2;
          while(*p1 == *p2 && *p1 != '\0' && *p2 != '\0')
              p1++, p2++;  /* keep searching... */
    
          if (*p1 == *p2)
             return 0;
          if (*p1 == '\0')
             return -1;
          if (*p2 == '\0')
             return +1;
    
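          /* Cast to unsigned char: passing a negative char to tolower()/isupper() is undefined. */
          int c1 = tolower((unsigned char)*p1), c2 = tolower((unsigned char)*p2);
          int u1 = isupper((unsigned char)*p1) != 0, u2 = isupper((unsigned char)*p2) != 0;
          if (c1 != c2)
            return c1 - c2;  // <<--- Alphabetical order assumption is used here
          return u1 - u2;    // same letter: lower case (u == 0) sorts before upper case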
      }
    

    The last lines can be compacted in this way:

         return (c1 != c2)? c1 - c2: u1 - u2;
    

    Now, by replacing strcmp() with my_strcmp(), you will get the desired result.
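One way to picture that replacement is to pass the comparison in as a parameter, so the sort itself never changes. The sketch below is an assumption about the shape of the OP's code (which is not shown here): a nested-loop exchange sort over an "array of array of char". The names `sort_rows`, `MAX_LEN`, and `cmp` are hypothetical.

```c
#include <stddef.h>
#include <string.h>

#define MAX_LEN 32  /* hypothetical fixed row width, not from the original post */

/* Nested-loop exchange sort; cmp follows the strcmp() convention
   (negative, zero, or positive result). */
void sort_rows(char rows[][MAX_LEN], size_t n,
               int (*cmp)(const char *, const char *))
{
    char tmp[MAX_LEN];
    for (size_t i = 0; i + 1 < n; i++) {
        for (size_t j = i + 1; j < n; j++) {
            if (cmp(rows[i], rows[j]) > 0) {
                /* Swap two whole rows (MAX_LEN bytes each). */
                memcpy(tmp, rows[i], MAX_LEN);
                memcpy(rows[i], rows[j], MAX_LEN);
                memcpy(rows[j], tmp, MAX_LEN);
            }
        }
    }
}
```

With this shape, `sort_rows(rows, n, strcmp)` gives plain byte order, while `sort_rows(rows, n, my_strcmp)` gives the lower-case-first order described above.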

    In a sort algorithm, it's a good idea to think separately about its 3 main aspects:

    • The comparison function.
    • The abstract sort algorithm that we will use.
    • The way in which data will be "moved" in the array when two items have to be swapped.

    These aspects can be optimized independently.
    Thus, for example, once you have the comparison function well settled, the next optimization step could be to replace the nested-for-loop sorting algorithm with a more efficient one, like quicksort.
    In particular, the function qsort() of the standard library provides you with such an algorithm, so you don't need to program it yourself.
    Finally, the strategy you use to store the array information could have consequences for performance.
    It would be more efficient to store the strings as an "array of pointers to char" instead of an "array of array of char", since swapping pointers is faster than swapping two entire arrays of chars.
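As a rough sketch of those last two points combined, the strings can be kept as an array of pointers and handed to qsort(). The wrapper names `cmp_for_qsort` and `sort_words` are hypothetical, and `my_strcmp` here is a compacted restatement of the function above, with unsigned char casts added as a precaution. Note that the C standard does not require qsort() to be a quicksort.

```c
#include <ctype.h>
#include <stdlib.h>

/* Compact restatement of my_strcmp(): for the same letter,
   lower case sorts before upper case. */
int my_strcmp(const char *s1, const char *s2)
{
    while (*s1 == *s2 && *s1 != '\0')
        s1++, s2++;
    int c1 = tolower((unsigned char)*s1), c2 = tolower((unsigned char)*s2);
    int u1 = isupper((unsigned char)*s1) != 0;
    int u2 = isupper((unsigned char)*s2) != 0;
    return (c1 != c2) ? c1 - c2 : u1 - u2;
}

/* qsort() passes pointers to the array elements; each element
   here is itself a pointer to char, hence the double indirection. */
int cmp_for_qsort(const void *a, const void *b)
{
    const char *s1 = *(const char *const *)a;
    const char *s2 = *(const char *const *)b;
    return my_strcmp(s1, s2);
}

/* Sorts the pointers only; the characters themselves are never moved. */
void sort_words(const char **words, size_t n)
{
    qsort(words, n, sizeof words[0], cmp_for_qsort);
}
```

Only the pointers are rearranged here, so each swap moves a few bytes regardless of how long the strings are.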

    Arrays of pointers

    ADDITIONAL NOTE: The first three if()'s are actually redundant, because the logic of the statements that follow already yields the desired result when *p1 or *p2 is 0. However, keeping those if()'s makes the code more readable.
