Sort on a string that may contain a number

走了就别回头了 2020-11-22 02:59

I need to write a Java Comparator class that compares Strings, but with one twist. If the two strings it is comparing are the same at the beginning and end of the string...

23 answers
  •  刺人心
    刺人心 (original poster)
    2020-11-22 03:28

    Adding on to the answer by @stanislav. I ran into a few problems while using that answer:

    1. Uppercase and lowercase letters are separated by the characters that lie between them in ASCII. This breaks the ordering when the strings being sorted contain _ or other characters that fall between uppercase and lowercase letters in ASCII.
    2. If two strings are the same except for the number of leading zeroes, the function returns 0, which makes the sort order depend on the original positions of the strings in the list.

    Both issues are fixed in the new code. I also extracted a few helper functions to replace repeated blocks of code. The differentCaseCompared variable tracks whether two strings are the same except for case. If they are, the difference between the first pair of characters that differ only in case is returned. This avoids returning 0 for two strings that differ only by case.
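
    Both problems can be reproduced without the original comparator. The sketch below is only an illustration: the class name SortPitfallsDemo and its two helper methods are hypothetical, and the numeric comparator merely stands in for any comparator that treats "1" and "01" as equal.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class SortPitfallsDemo {

    // Problem 1: plain code-point order puts '_' (95) after 'Z' (90) and before 'a' (97),
    // so an underscore lands between uppercase and lowercase letters.
    static List<String> sortByCodePoint(List<String> input) {
        List<String> copy = new ArrayList<>(input);
        copy.sort(Comparator.naturalOrder());
        return copy;
    }

    // Problem 2: this stand-in comparator returns 0 for "1" vs "01".
    // List.sort is stable, so such ties keep whatever order the input had.
    static List<String> sortByNumericValue(List<String> input) {
        Comparator<String> byNumericValue = Comparator.comparingInt(Integer::parseInt);
        List<String> copy = new ArrayList<>(input);
        copy.sort(byNumericValue);
        return copy;
    }

    public static void main(String[] args) {
        System.out.println(sortByCodePoint(Arrays.asList("b", "_", "B")));      // [B, _, b]
        System.out.println(sortByNumericValue(Arrays.asList("01", "1", "2")));  // [01, 1, 2]
        System.out.println(sortByNumericValue(Arrays.asList("1", "01", "2")));  // [1, 01, 2]
    }
}
```

    The last two calls sort the same three strings but give different results, which is exactly the dependence on input order described in point 2.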
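
    The case tie-break can be looked at in isolation. This is a minimal sketch with the digit handling left out, assuming the same idea as the differentCaseCompared variable; the class name CaseTieBreakSketch is hypothetical and not part of the comparator itself.

```java
class CaseTieBreakSketch {

    // Compares case-insensitively first. The first pair of characters that
    // differ only in case is remembered and used if nothing else differs.
    static int compare(String a, String b) {
        int differentCaseCompared = 0;
        int common = Math.min(a.length(), b.length());
        for (int i = 0; i < common; i++) {
            char ch1 = a.charAt(i);
            char ch2 = b.charAt(i);
            if (ch1 == ch2) {
                continue;
            }
            int caseInsensitiveDiff = Character.toLowerCase(ch1) - Character.toLowerCase(ch2);
            if (caseInsensitiveDiff != 0) {
                return caseInsensitiveDiff;
            }
            // Same letter, different case: record only the first such difference.
            if (differentCaseCompared == 0) {
                differentCaseCompared = ch1 - ch2;
            }
        }
        // A string that is a case-insensitive prefix of the other sorts first.
        if (a.length() != b.length()) {
            return a.length() - b.length();
        }
        return differentCaseCompared;
    }

    public static void main(String[] args) {
        System.out.println(compare("t1A", "t1a") < 0);   // true: 'A' - 'a' is negative
        System.out.println(compare("t1a", "t1AB") < 0);  // true: the shorter string sorts first
        System.out.println(compare("_1", "a2") < 0);     // true: '_' (95) < 'a' (97)
    }
}
```

    Because every letter is lowercased before the main comparison, _ sorts before all letters instead of between the two cases.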

    
    import java.util.Comparator;

    public class NaturalSortingComparator implements Comparator<String> {
    
        @Override
        public int compare(String string1, String string2) {
            int lengthOfString1 = string1.length();
            int lengthOfString2 = string2.length();
            int iteratorOfString1 = 0;
            int iteratorOfString2 = 0;
            int differentCaseCompared = 0;
            while (true) {
                if (iteratorOfString1 == lengthOfString1) {
                    if (iteratorOfString2 == lengthOfString2) {
                        if (lengthOfString1 == lengthOfString2) {
                            // Both strings are exhausted and have the same length:
                            // return the recorded case difference (0 if none was seen)
                            return differentCaseCompared;
                        }
                        // Both strings are exhausted but their lengths differ (for example, a different
                        // number of leading zeroes): the shorter string sorts first
                        else {
                            return lengthOfString1 - lengthOfString2;
                        }
                    }
                    // string1 is exhausted but string2 is not: string1 sorts first
                    else
                        return -1;
                }
                // string2 is exhausted but string1 is not: string2 sorts first
                if (iteratorOfString2 == lengthOfString2) {
                    return 1;
                }
    
                char ch1 = string1.charAt(iteratorOfString1);
                char ch2 = string2.charAt(iteratorOfString2);
    
                if (Character.isDigit(ch1) && Character.isDigit(ch2)) {
                    // skip leading zeros
                    iteratorOfString1 = skipLeadingZeroes(string1, lengthOfString1, iteratorOfString1);
                    iteratorOfString2 = skipLeadingZeroes(string2, lengthOfString2, iteratorOfString2);
    
                    // find the ends of the numbers
                    int endPositionOfNumbersInString1 = findEndPositionOfNumber(string1, lengthOfString1, iteratorOfString1);
                    int endPositionOfNumbersInString2 = findEndPositionOfNumber(string2, lengthOfString2, iteratorOfString2);
    
                    int lengthOfDigitsInString1 = endPositionOfNumbersInString1 - iteratorOfString1;
                    int lengthOfDigitsInString2 = endPositionOfNumbersInString2 - iteratorOfString2;
    
                    // if the lengths are different, then the longer number is bigger
                    if (lengthOfDigitsInString1 != lengthOfDigitsInString2)
                        return lengthOfDigitsInString1 - lengthOfDigitsInString2;
    
                    // compare numbers digit by digit
                    while (iteratorOfString1 < endPositionOfNumbersInString1) {
    
                        if (string1.charAt(iteratorOfString1) != string2.charAt(iteratorOfString2))
                            return string1.charAt(iteratorOfString1) - string2.charAt(iteratorOfString2);
    
                        iteratorOfString1++;
                        iteratorOfString2++;
                    }
                } else {
                    // plain characters comparison
                    if (ch1 != ch2) {
                        if (!ignoreCharacterCaseEquals(ch1, ch2))
                            return Character.toLowerCase(ch1) - Character.toLowerCase(ch2);
    
                        // Set a differentCaseCompared if the characters being compared are different case.
                        // Should be done only once, hence the check with 0
                        if (differentCaseCompared == 0) {
                            differentCaseCompared = ch1 - ch2;
                        }
                    }
    
                    iteratorOfString1++;
                    iteratorOfString2++;
                }
            }
        }
    
        private boolean ignoreCharacterCaseEquals(char character1, char character2) {
    
            return Character.toLowerCase(character1) == Character.toLowerCase(character2);
        }
    
        private int findEndPositionOfNumber(String string, int lengthOfString, int end) {
    
            while (end < lengthOfString && Character.isDigit(string.charAt(end)))
                end++;
    
            return end;
        }
    
        private int skipLeadingZeroes(String string, int lengthOfString, int iteratorOfString) {
    
            while (iteratorOfString < lengthOfString && string.charAt(iteratorOfString) == '0')
                iteratorOfString++;
    
            return iteratorOfString;
        }
    }
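
    The numeric branch compares digit runs without parsing them, so long runs cannot overflow an int or long. Here is that step on its own, as a sketch that assumes both arguments consist only of ASCII digits; the class DigitRunCompareSketch and its method names are hypothetical. Unlike the full comparator, it returns 0 for "010" vs "10", because the full comparator only breaks that tie later, by overall string length.

```java
class DigitRunCompareSketch {

    // Compares two digit-only strings by numeric value without parsing them.
    static int compareDigitRuns(String digits1, String digits2) {
        int start1 = skipLeadingZeroes(digits1);
        int start2 = skipLeadingZeroes(digits2);
        int significant1 = digits1.length() - start1;
        int significant2 = digits2.length() - start2;

        // With leading zeroes gone, the run with more digits is the larger number.
        if (significant1 != significant2) {
            return significant1 - significant2;
        }
        // Same number of significant digits: the first differing digit decides.
        for (int i = 0; i < significant1; i++) {
            int diff = digits1.charAt(start1 + i) - digits2.charAt(start2 + i);
            if (diff != 0) {
                return diff;
            }
        }
        return 0;
    }

    // Returns the index of the first character that is not '0'.
    static int skipLeadingZeroes(String digits) {
        int index = 0;
        while (index < digits.length() && digits.charAt(index) == '0') {
            index++;
        }
        return index;
    }

    public static void main(String[] args) {
        System.out.println(compareDigitRuns("2", "10") < 0);      // true
        System.out.println(compareDigitRuns("010", "10") == 0);   // true
        System.out.println(compareDigitRuns("0099", "100") < 0);  // true
    }
}
```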
    

    The following is a unit test I used.

    
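    // The org.junit imports assume JUnit 4; adjust them for another test framework.
    import java.util.Arrays;
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.List;

    import org.junit.Assert;
    import org.junit.Test;

    public class NaturalSortingComparatorTest {
    
        private static final int NUMBER_OF_TEST_CASES = 100000;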
    
        @Test
        public void compare() {
    
            NaturalSortingComparator naturalSortingComparator = new NaturalSortingComparator();
    
            List<String> expectedStringList = getCorrectStringList();
            List<String> testListOfStrings = createTestListOfStrings();
            runTestCases(expectedStringList, testListOfStrings, NUMBER_OF_TEST_CASES, naturalSortingComparator);
    
        }
    
        private void runTestCases(List<String> expectedStringList, List<String> testListOfStrings,
                                  int numberOfTestCases, Comparator<String> comparator) {
    
            for (int testCase = 0; testCase < numberOfTestCases; testCase++) {
                Collections.shuffle(testListOfStrings);
                testListOfStrings.sort(comparator);
                Assert.assertEquals(expectedStringList, testListOfStrings);
            }
        }
    
        private List<String> getCorrectStringList() {
            return Arrays.asList(
                    "1", "01", "001", "2", "02", "10", "10", "010",
                    "20", "100", "_1", "_01", "_2", "_200", "A 02",
                    "A01", "a2", "A20", "t1A", "t1a", "t1AB", "t1Ab",
                    "t1aB", "t1ab", "T010T01", "T0010T01");
        }
    
        private List<String> createTestListOfStrings() {
            return Arrays.asList(
                    "10", "20", "A20", "2", "t1ab", "01", "T010T01", "t1aB",
                    "_2", "001", "_200", "1", "A 02", "t1Ab", "a2", "_1", "t1A", "_01",
                    "100", "02", "T0010T01", "t1AB", "10", "A01", "010", "t1a");
        }
    
    }
    

    Suggestions welcome! I am not sure whether extracting the helper functions changes anything other than readability.

    P.S.: Sorry for adding another answer to this question, but I don't have enough reputation to comment on the answer I modified for my own use.
