sort array of integers lexicographically C++

前端未结

关注

 12  1695

野的像风 2021-02-02 13:53

I want to sort a large array of integers (say 1 millon elements) lexicographically.

Example:

input [] = { 100, 21 , 22 , 99 , 1  , 927 }
sorted[] = { 1


      
      
        
          12条回答        

        
                    
            
            
                         
                
              
              
                
                   青春惊慌失措
                                             
                
                
                (楼主)
            
              
              
                2021-02-02 14:14
              

            
            
                        
To provide any custom sort ordering, you can provide a comparator to std::sort. In this case, it's going to be somewhat complex, using logarithms to inspect individual digits of your number in base 10.

Here's an example — comments inline describe what's going on.

#include 
#include 
#include 
#include 

int main() {
    int input[] { 100, 21, 22, 99, 1, 927, -50, -24, -160 };

    /**
     * Sorts the array lexicographically.
     * 
     * The trick is that we have to compare digits left-to-right
     * (considering typical Latin decimal notation) and that each of
     * two numbers to compare may have a different number of digits.
     * 
     * This is very efficient in storage space, but inefficient in
     * execution time; an approach that pre-visits each element and
     * stores a translated representation will at least double your
     * storage requirements (possibly a problem with large inputs)
     * but require only a single translation of each element.
     */
    std::sort(
        std::begin(input),
        std::end(input),
        [](int lhs, int rhs) -> bool {
            // Returns true if lhs < rhs
            // Returns false otherwise
            const auto BASE      = 10;
            const bool LHS_FIRST = true;
            const bool RHS_FIRST = false;
            const bool EQUAL     = false;


            // There's no point in doing anything at all
            // if both inputs are the same; strict-weak
            // ordering requires that we return `false`
            // in this case.
            if (lhs == rhs) {
                return EQUAL;
            }

            // Compensate for sign
            if (lhs < 0 && rhs < 0) {
                // When both are negative, sign on its own yields
                // no clear ordering between the two arguments.
                // 
                // Remove the sign and continue as for positive
                // numbers.
                lhs *= -1;
                rhs *= -1;
            }
            else if (lhs < 0) {
                // When the LHS is negative but the RHS is not,
                // consider the LHS "first" always as we wish to
                // prioritise the leading '-'.
                return LHS_FIRST;
            }
            else if (rhs < 0) {
                // When the RHS is negative but the LHS is not,
                // consider the RHS "first" always as we wish to
                // prioritise the leading '-'.
                return RHS_FIRST;
            }

            // Counting the number of digits in both the LHS and RHS
            // arguments is *almost* trivial.
            const auto lhs_digits = (
                lhs == 0
                ? 1
                : std::ceil(std::log(lhs+1)/std::log(BASE))
            );

            const auto rhs_digits = (
                rhs == 0
                ? 1
                : std::ceil(std::log(rhs+1)/std::log(BASE))
            );

            // Now we loop through the positions, left-to-right,
            // calculating the digit at these positions for each
            // input, and comparing them numerically. The
            // lexicographic nature of the sorting comes from the
            // fact that we are doing this per-digit comparison
            // rather than considering the input value as a whole.
            const auto max_pos = std::max(lhs_digits, rhs_digits);
            for (auto pos = 0; pos < max_pos; pos++) {
                if (lhs_digits - pos == 0) {
                    // Ran out of digits on the LHS;
                    // prioritise the shorter input
                    return LHS_FIRST;
                }
                else if (rhs_digits - pos == 0) {
                    // Ran out of digits on the RHS;
                    // prioritise the shorter input
                    return RHS_FIRST;
                }
                else {
                    const auto lhs_x = (lhs / static_cast(std::pow(BASE, lhs_digits - 1 - pos))) % BASE;
                    const auto rhs_x = (rhs / static_cast(std::pow(BASE, rhs_digits - 1 - pos))) % BASE;

                    if (lhs_x < rhs_x)
                        return LHS_FIRST;
                    else if (rhs_x < lhs_x)
                        return RHS_FIRST;
                }
            }

            // If we reached the end and everything still
            // matches up, then something probably went wrong
            // as I'd have expected to catch this in the tests
            // for equality.
            assert("Unknown case encountered");
        }
    );

    std::cout << '{';
    for (auto x : input)
        std::cout << x << ", ";
    std::cout << '}';

    // Output: {-160, -24, -50, 1, 100, 21, 22, 927, 99, }
}


Demo

There are quicker ways to calculate the number of digits in a number, but the above will get you started.
    
             
                                                        
            
            
              
                
                0
              
                   
                
               讨论(0)
              
                                                  
              
              
                          
             
       
          
              
                                       
     查看其它12个回答


            
                         
                    


               
            
    发布评论:
    
         
                        
    
    提交评论 
  
  

                    
                    
                    
                        
                        
                         加载中...
                        
                    
                
          
                              			
        
        
        
          
            
            
              
              
            
    


                                 
              
            
                          
    

        
         
                验证码
                
                  
                
                
                   看不清?
                
              
                                  
                    
   
                 
             
              提交回复