What is the best algorithm for arbitrary delimiter/escape character processing?

心在旅途 2021-01-02 07:57

I'm a little surprised that there isn't more information on this on the web, and I keep finding that the problem is a little stickier than I thought.

Here's the r

7 answers
  •  旧巷少年郎
    2021-01-02 08:15

    Here's a more idiomatic and readable way to do it:

    // Requires: using System.Collections.Generic; using System.Text;
    public IEnumerable<string> SplitAndUnescape(
        string encodedString,
        char separator,
        char escape)
    {
        var inEscapeSequence = false;
        var currentToken = new StringBuilder();

        foreach (var currentCharacter in encodedString)
        {
            if (inEscapeSequence)
            {
                // The character after an escape is always taken literally.
                currentToken.Append(currentCharacter);
                inEscapeSequence = false;
            }
            else if (currentCharacter == escape)
            {
                inEscapeSequence = true;
            }
            else if (currentCharacter == separator)
            {
                // An unescaped separator ends the current token.
                yield return currentToken.ToString();
                currentToken.Clear();
            }
            else
            {
                currentToken.Append(currentCharacter);
            }
        }

        // Emit the final token (it may be empty).
        yield return currentToken.ToString();
    }
    

    Note that this doesn't remove empty elements. I don't think that should be the responsibility of the parser. If you want to remove them, just call Where(item => item.Any()) on the result.
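
    As a rough illustration of that filtering step, here is a minimal sketch. The class name EscapedSplitDemo and the helper SplitNonEmpty are hypothetical names introduced only for this example; the parser itself is repeated so the snippet compiles on its own.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical container class, used only for this illustration.
public static class EscapedSplitDemo
{
    // Same state machine as the method above, repeated so the sketch is self-contained.
    public static IEnumerable<string> SplitAndUnescape(
        string encodedString, char separator, char escape)
    {
        var inEscapeSequence = false;
        var currentToken = new StringBuilder();

        foreach (var currentCharacter in encodedString)
        {
            if (inEscapeSequence)
            {
                currentToken.Append(currentCharacter);
                inEscapeSequence = false;
            }
            else if (currentCharacter == escape)
            {
                inEscapeSequence = true;
            }
            else if (currentCharacter == separator)
            {
                yield return currentToken.ToString();
                currentToken.Clear();
            }
            else
            {
                currentToken.Append(currentCharacter);
            }
        }

        yield return currentToken.ToString();
    }

    // Hypothetical helper: the caller, not the parser, drops empty tokens.
    public static List<string> SplitNonEmpty(
        string encodedString, char separator, char escape)
    {
        return SplitAndUnescape(encodedString, separator, escape)
            .Where(item => item.Any())
            .ToList();
    }
}
```

    With ',' as the separator and '\' as the escape character, the input a,,b\,c, parses to four tokens ("a", "", "b,c", ""), and SplitNonEmpty keeps only "a" and "b,c".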

    I think this is too much logic for a single method; it gets hard to follow. If someone has time, I think it would be better to break it up into multiple methods and maybe its own class.
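
    One possible way to break it up, offered only as a sketch: the class name EscapedSplitter and the helper names StartsEscape, IsTokenBoundary and TakeToken are assumptions made up for this example, not part of the answer above. The behavior is meant to match the single-method version.

```csharp
using System.Collections.Generic;
using System.Text;

// Hypothetical class: holds the delimiter configuration so callers pass only the input.
public sealed class EscapedSplitter
{
    private readonly char _separator;
    private readonly char _escape;

    public EscapedSplitter(char separator, char escape)
    {
        _separator = separator;
        _escape = escape;
    }

    public IEnumerable<string> Split(string encodedString)
    {
        var currentToken = new StringBuilder();
        var inEscapeSequence = false;

        foreach (var c in encodedString)
        {
            // The escape check comes first, matching the order in the original method.
            if (StartsEscape(c, inEscapeSequence))
            {
                inEscapeSequence = true;
            }
            else if (IsTokenBoundary(c, inEscapeSequence))
            {
                yield return TakeToken(currentToken);
            }
            else
            {
                // Ordinary character, or any character that follows an escape.
                currentToken.Append(c);
                inEscapeSequence = false;
            }
        }

        // Emit the final token (it may be empty).
        yield return TakeToken(currentToken);
    }

    private bool StartsEscape(char c, bool inEscapeSequence) =>
        !inEscapeSequence && c == _escape;

    private bool IsTokenBoundary(char c, bool inEscapeSequence) =>
        !inEscapeSequence && c == _separator;

    // Returns the buffered text and resets the buffer for the next token.
    private static string TakeToken(StringBuilder buffer)
    {
        var token = buffer.ToString();
        buffer.Clear();
        return token;
    }
}
```

    Whether this reads better than the single method is a judgment call; the main gain is that each rule (escape start, token boundary, token hand-off) has a name and can be changed in one place.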
