Split a string ignoring quoted sections

别跟我提以往 2020-12-06 00:15

Given a string like this:

a,"string, with",various,"values, and some",quoted

What is a good algorithm to split this based on commas while ignoring the commas inside the quoted sections?

13 answers
  •  夕颜 (OP)
    2020-12-06 00:41

    Since you said language agnostic, I wrote my algorithm in the language that's as close to pseudocode as possible:

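    def find_character_indices(s, ch):
        """Return every index in s at which the character ch occurs."""
        return [i for i, ltr in enumerate(s) if ltr == ch]


    def _split_unquoted(text, sep):
        """Split an unquoted stretch on sep, dropping the empty pieces left next to quotes."""
        return [part for part in text.split(sep) if part]


    def split_text_preserving_quotes(content, include_quotes=False, sep=','):
        # sep defaults to ',' to match the example in the question.
        quote_indices = find_character_indices(content, '"')
        if not quote_indices:  # no quotes at all: a plain split is enough
            return _split_unquoted(content, sep)

        # Everything before the first quote is unquoted.
        output = _split_unquoted(content[:quote_indices[0]], sep)

        for i in range(1, len(quote_indices)):
            if i % 2 == 1:  # end of quoted sequence
                start = quote_indices[i - 1]
                end = quote_indices[i] + 1
                quoted = content[start:end]
                output.append(quoted if include_quotes else quoted[1:-1])
            else:  # unquoted stretch between two quoted sequences
                start = quote_indices[i - 1] + 1
                end = quote_indices[i]
                output.extend(_split_unquoted(content[start:end], sep))

        # Everything after the last quote is unquoted; add it once, after the loop.
        output += _split_unquoted(content[quote_indices[-1] + 1:], sep)

        return output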
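
    The same idea can also be written as a single pass that tracks whether the scan is currently inside quotes. This is only a minimal sketch, not part of the code above: the name split_on_commas_outside_quotes and its sep and quote parameters are made up for illustration, and it assumes quotes are balanced and never escaped.

```python
def split_on_commas_outside_quotes(text, sep=",", quote='"'):
    """Split text on sep, treating any sep inside a quoted section as literal.

    Hypothetical helper for illustration; assumes balanced, unescaped quotes.
    """
    fields = []
    current = []        # characters of the field being built
    in_quotes = False   # True while the scan is inside a quoted section

    for ch in text:
        if ch == quote:
            in_quotes = not in_quotes        # entering or leaving a quoted section
        elif ch == sep and not in_quotes:
            fields.append("".join(current))  # separator outside quotes ends the field
            current = []
        else:
            current.append(ch)               # ordinary character, or sep inside quotes

    fields.append("".join(current))          # the last field has no trailing separator
    return fields


print(split_on_commas_outside_quotes('a,"string, with",various,"values, and some",quoted'))
# ['a', 'string, with', 'various', 'values, and some', 'quoted']
```

    Unlike the index-based version, this one keeps empty fields (so 'a,,b' gives three pieces) and drops the quote characters themselves. For real CSV data, Python's standard csv module is probably the safer choice, since it also handles doubled quotes inside quoted fields.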
    
