Python split string without splitting escaped character

谎友^ 2020-12-08 20:56

Is there a way to split a string without splitting on an escaped character? For example, I have a string and want to split on ':' but not on '\:'.

http\\://ww         


        
10 answers
  •  生来不讨喜
    2020-12-08 21:38

    I have created this method, which is inspired by Henry Keiter's answer, but has the following advantages:

    • The escape character and the delimiter are both configurable
    • The escape character is not removed when it is not actually escaping anything

    This is the code:

    def _split_string(self, string: str, delimiter: str, escape: str) -> list[str]:
        result = []
        current_element = []
        iterator = iter(string)
        for character in iterator:
            if character == escape:
                try:
                    next_character = next(iterator)
                    if next_character != delimiter and next_character != escape:
                        # Drop the escape character when it escapes either the delimiter or the
                        # escape character itself. Keep it when it precedes any other character,
                        # because then it is not escaping anything.
                        current_element.append(escape)
                    current_element.append(next_character)
                except StopIteration:
                    current_element.append(escape)
            elif character == delimiter:
                # split! (add current to the list and reset it)
                result.append(''.join(current_element))
                current_element = []
            else:
                current_element.append(character)
        result.append(''.join(current_element))
        return result
    

    This is test code indicating the behavior:

    def test_split_string(self):
        # Verify normal behavior
        self.assertListEqual(['A', 'B'], list(self.sut._split_string('A+B', '+', '?')))
    
        # Verify that escape character escapes the delimiter
        self.assertListEqual(['A+B'], list(self.sut._split_string('A?+B', '+', '?')))
    
        # Verify that the escape character escapes the escape character
        self.assertListEqual(['A?', 'B'], list(self.sut._split_string('A??+B', '+', '?')))
    
        # Verify that the escape character is just copied if it doesn't escape the delimiter or escape character
        self.assertListEqual(['A?+B'], list(self.sut._split_string('A?+B', '\'', '?')))
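
For readers who want to try this outside a class, the same idea can be sketched as a standalone function. This is only an illustration: the name `split_unescaped` and the sample input string are assumptions made for the example and are not part of the answer above. It also assumes the delimiter and the escape are single characters.

```python
def split_unescaped(text: str, delimiter: str, escape: str) -> list[str]:
    """Split text on delimiter, skipping delimiters preceded by escape.

    Hypothetical standalone variant of the method above; assumes that
    delimiter and escape are each a single character.
    """
    parts = []
    current = []
    chars = iter(text)
    for ch in chars:
        if ch == escape:
            # Look at the character that follows the escape, if any.
            nxt = next(chars, None)
            if nxt is None:
                # Trailing escape with nothing to escape: keep it literally.
                current.append(escape)
            else:
                if nxt != delimiter and nxt != escape:
                    # The escape was not escaping anything: keep it.
                    current.append(escape)
                current.append(nxt)
        elif ch == delimiter:
            # Unescaped delimiter: close the current element.
            parts.append(''.join(current))
            current = []
        else:
            current.append(ch)
    parts.append(''.join(current))
    return parts


# Made-up input: split on ':' with a backslash as the escape character.
print(split_unescaped(r'http\://example.com:80', ':', '\\'))
# prints ['http://example.com', '80']
```

Using `next(chars, None)` instead of catching `StopIteration` is just a stylistic choice here; the behavior for a trailing escape character is meant to match the method above.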
    
