fast custom string splitting

离开以前 2021-01-06 17:25

I am writing a custom string split. It will split on a dot (.) that is not preceded by an odd number of backslashes (\).

3 Answers
  •  萌比男神i
    2021-01-06 18:09

    You shouldn't try to use string.Split for that.

    If you need help implementing it, a simple way to solve this is to have a loop that scans the string, keeping track of the last place where you found a qualifying dot. When you find a new qualifying dot (or reach the end of the input string), just yield return the current substring.
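
    The loop described above could be sketched roughly as follows. This is only an illustration under a few assumptions: the names DotSplitter and SplitOnUnescapedDots are made up for the example, a "qualifying" dot is taken to be one preceded by an even number (including zero) of consecutive backslashes, backslashes are left in the returned pieces rather than unescaped, and empty pieces are kept.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of the original question or answer.
public static class DotSplitter
{
    // Lazily yields the pieces of `input`, splitting on each '.' that is
    // preceded by an even number (including zero) of consecutive backslashes.
    // Backslashes are kept as-is in the returned pieces.
    public static IEnumerable<string> SplitOnUnescapedDots(string input)
    {
        int start = 0;        // index where the current piece begins
        int backslashes = 0;  // length of the backslash run ending just before i

        for (int i = 0; i < input.Length; i++)
        {
            char c = input[i];
            if (c == '\\')
            {
                backslashes++;
                continue;
            }
            if (c == '.' && backslashes % 2 == 0)
            {
                // Qualifying dot: emit everything since the previous one.
                yield return input.Substring(start, i - start);
                start = i + 1;
            }
            backslashes = 0;  // any non-backslash character ends the run
        }

        // Reaching the end of the input closes the last piece.
        yield return input.Substring(start);
    }
}
```

    With this sketch, iterating over DotSplitter.SplitOnUnescapedDots(@"a\.b.c") in a foreach yields a\.b and then c, because the first dot follows a single (odd) backslash and is skipped.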

    Edit: about returning a list or an array vs. using yield

    If the most important thing in your application is the time the caller spends iterating over the substrings, then you should populate a list or an array and return that, as suggested in the accepted answer. I would not use a resizable array while collecting the substrings, because repeated resizing would be slow.

    On the other hand, if you care about overall performance and memory, and if the caller sometimes doesn't need to iterate over the entire list, you should use yield return. With yield return, no code at all executes until the caller calls MoveNext (directly, or indirectly through a foreach). This saves the memory needed for the array or list, and the time spent allocating, resizing and populating it. You spend time almost exclusively on the logic of finding the substrings, and that work is done lazily, that is, only when the caller actually continues to iterate over the substrings.
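
    To make the difference between the two approaches concrete, here is a small sketch. The class LazyVsEager, its methods and the CharsScanned counter are all hypothetical, added only so that the amount of scanning work can be observed; the splitting rule is assumed to be the same as in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class, not part of the original answer.
public static class LazyVsEager
{
    // Counts characters examined so far (instrumentation for the demo only).
    public static int CharsScanned;

    // Lazy variant: the body runs only while the caller is iterating.
    public static IEnumerable<string> SplitLazy(string input)
    {
        int start = 0, backslashes = 0;
        for (int i = 0; i < input.Length; i++)
        {
            CharsScanned++;
            char c = input[i];
            if (c == '\\') { backslashes++; continue; }
            if (c == '.' && backslashes % 2 == 0)
            {
                yield return input.Substring(start, i - start);
                start = i + 1;
            }
            backslashes = 0;
        }
        yield return input.Substring(start);
    }

    // Eager variant: scans the whole input up front and returns a filled list.
    public static List<string> SplitEager(string input)
    {
        return SplitLazy(input).ToList();
    }
}
```

    Calling LazyVsEager.SplitLazy("a.b.c.d") by itself leaves CharsScanned at 0, since nothing runs before the first MoveNext. Taking only First() from that sequence examines just the two characters "a." and stops. LazyVsEager.SplitEager("a.b.c.d"), by contrast, examines all seven characters before returning, whether or not the caller ends up reading every piece.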
