sort array of integers lexicographically C++

前端 未结 12 1606
野的像风
野的像风 2021-02-02 13:53

I want to sort a large array of integers (say 1 millon elements) lexicographically.

Example:

input [] = { 100, 21 , 22 , 99 , 1  , 927 }
sorted[] = { 1           


        
12条回答
  •  青春惊慌失措
    2021-02-02 14:14

    To provide any custom sort ordering, you can provide a comparator to std::sort. In this case, it's going to be somewhat complex, using logarithms to inspect individual digits of your number in base 10.

    Here's an example — comments inline describe what's going on.

    #include 
    #include 
    #include 
    #include 
    
    int main() {
        int input[] { 100, 21, 22, 99, 1, 927, -50, -24, -160 };
    
        /**
         * Sorts the array lexicographically.
         * 
         * The trick is that we have to compare digits left-to-right
         * (considering typical Latin decimal notation) and that each of
         * two numbers to compare may have a different number of digits.
         * 
         * This is very efficient in storage space, but inefficient in
         * execution time; an approach that pre-visits each element and
         * stores a translated representation will at least double your
         * storage requirements (possibly a problem with large inputs)
         * but require only a single translation of each element.
         */
        std::sort(
            std::begin(input),
            std::end(input),
            [](int lhs, int rhs) -> bool {
                // Returns true if lhs < rhs
                // Returns false otherwise
                const auto BASE      = 10;
                const bool LHS_FIRST = true;
                const bool RHS_FIRST = false;
                const bool EQUAL     = false;
    
    
                // There's no point in doing anything at all
                // if both inputs are the same; strict-weak
                // ordering requires that we return `false`
                // in this case.
                if (lhs == rhs) {
                    return EQUAL;
                }
    
                // Compensate for sign
                if (lhs < 0 && rhs < 0) {
                    // When both are negative, sign on its own yields
                    // no clear ordering between the two arguments.
                    // 
                    // Remove the sign and continue as for positive
                    // numbers.
                    lhs *= -1;
                    rhs *= -1;
                }
                else if (lhs < 0) {
                    // When the LHS is negative but the RHS is not,
                    // consider the LHS "first" always as we wish to
                    // prioritise the leading '-'.
                    return LHS_FIRST;
                }
                else if (rhs < 0) {
                    // When the RHS is negative but the LHS is not,
                    // consider the RHS "first" always as we wish to
                    // prioritise the leading '-'.
                    return RHS_FIRST;
                }
    
                // Counting the number of digits in both the LHS and RHS
                // arguments is *almost* trivial.
                const auto lhs_digits = (
                    lhs == 0
                    ? 1
                    : std::ceil(std::log(lhs+1)/std::log(BASE))
                );
    
                const auto rhs_digits = (
                    rhs == 0
                    ? 1
                    : std::ceil(std::log(rhs+1)/std::log(BASE))
                );
    
                // Now we loop through the positions, left-to-right,
                // calculating the digit at these positions for each
                // input, and comparing them numerically. The
                // lexicographic nature of the sorting comes from the
                // fact that we are doing this per-digit comparison
                // rather than considering the input value as a whole.
                const auto max_pos = std::max(lhs_digits, rhs_digits);
                for (auto pos = 0; pos < max_pos; pos++) {
                    if (lhs_digits - pos == 0) {
                        // Ran out of digits on the LHS;
                        // prioritise the shorter input
                        return LHS_FIRST;
                    }
                    else if (rhs_digits - pos == 0) {
                        // Ran out of digits on the RHS;
                        // prioritise the shorter input
                        return RHS_FIRST;
                    }
                    else {
                        const auto lhs_x = (lhs / static_cast(std::pow(BASE, lhs_digits - 1 - pos))) % BASE;
                        const auto rhs_x = (rhs / static_cast(std::pow(BASE, rhs_digits - 1 - pos))) % BASE;
    
                        if (lhs_x < rhs_x)
                            return LHS_FIRST;
                        else if (rhs_x < lhs_x)
                            return RHS_FIRST;
                    }
                }
    
                // If we reached the end and everything still
                // matches up, then something probably went wrong
                // as I'd have expected to catch this in the tests
                // for equality.
                assert("Unknown case encountered");
            }
        );
    
        std::cout << '{';
        for (auto x : input)
            std::cout << x << ", ";
        std::cout << '}';
    
        // Output: {-160, -24, -50, 1, 100, 21, 22, 927, 99, }
    }
    

    Demo

    There are quicker ways to calculate the number of digits in a number, but the above will get you started.

提交回复
热议问题