Replace a list of invalid characters with their valid versions (like tr)

你的背包 2021-01-05 18:26

I need to do something like this imaginary .trReplace:

  str = str.trReplace("áéíüñ", "aeiu&");

It should change this str

4 answers
  •  夕颜 (OP)
    2021-01-05 19:14

    Richard has a good answer, but performance may suffer slightly on longer strings (about 25% slower than the straight string replace shown in the question). I felt compelled to look into this a little further. There are actually several good related answers already on StackOverflow, listed below:

    Fastest way to remove chars from string

    C# Stripping / converting one or more characters

    There is also a good article on CodeProject covering the different options.

    http://www.codeproject.com/KB/string/fastestcscaseinsstringrep.aspx

    The function in Richard's answer gets slower with longer strings because the replacements happen one character at a time; if you have long runs of non-mapped characters, you waste cycles re-appending the string piece by piece. Taking a few points from the CodeProject article, you end up with a slightly modified version of Richard's answer that looks like:

    private static readonly Char[] ReplacementChars = new[] { 'á', 'é', 'í', 'ü', 'ñ' };
    private static readonly Dictionary<char, char> ReplacementMappings = new Dictionary<char, char>
                                                                   {
                                                                     { 'á', 'a'},
                                                                     { 'é', 'e'},
                                                                     { 'í', 'i'},
                                                                     { 'ü', 'u'},
                                                                     { 'ñ', '&'}
                                                                   };
    
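    private static string Translate(String source)
    {
      var startIndex = 0;
      var currentIndex = 0;
      var result = new StringBuilder(source.Length);
    
      // Jump straight to the next mapped character instead of visiting every char.
      while ((currentIndex = source.IndexOfAny(ReplacementChars, startIndex)) != -1)
      {
        // Copy the unmapped run in one call, then append the replacement.
        result.Append(source.Substring(startIndex, currentIndex - startIndex));
        result.Append(ReplacementMappings[source[currentIndex]]);
    
        startIndex = currentIndex + 1;
      }
    
      // Nothing was replaced: return the original string without copying.
      if (startIndex == 0)
        return source;
    
      // Append whatever follows the last replaced character.
      result.Append(source.Substring(startIndex));
    
      return result.ToString();
    }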
    

    NOTE Not all edge cases have been tested.

    NOTE Could replace ReplacementChars with ReplacementMappings.Keys.ToArray() for a slight cost.
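
    A minimal sketch of that variation, assuming the same five mappings as above. The class name TrSketch and the Main method are only there to make the example self-contained; they are not part of the original answer. The key array is built once from the dictionary, so the two collections cannot drift apart:

    ```csharp
    using System;
    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    public static class TrSketch
    {
        private static readonly Dictionary<char, char> ReplacementMappings = new Dictionary<char, char>
        {
            { 'á', 'a' }, { 'é', 'e' }, { 'í', 'i' }, { 'ü', 'u' }, { 'ñ', '&' }
        };

        // Static field initializers run in textual order, so this line must stay
        // below ReplacementMappings; otherwise the dictionary would still be null here.
        private static readonly char[] ReplacementChars = ReplacementMappings.Keys.ToArray();

        public static string Translate(string source)
        {
            var startIndex = 0;
            int currentIndex;
            var result = new StringBuilder(source.Length);

            while ((currentIndex = source.IndexOfAny(ReplacementChars, startIndex)) != -1)
            {
                // Copy the unmapped run, then the replacement for the mapped character.
                result.Append(source.Substring(startIndex, currentIndex - startIndex));
                result.Append(ReplacementMappings[source[currentIndex]]);
                startIndex = currentIndex + 1;
            }

            // No mapped character found: hand back the original instance.
            if (startIndex == 0)
                return source;

            result.Append(source.Substring(startIndex));
            return result.ToString();
        }

        public static void Main()
        {
            Console.WriteLine(Translate("áéíüñ")); // prints "aeiu&"
        }
    }
    ```

    The "slight cost" is the one-time ToArray() call at type initialization; the per-call work is unchanged.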

    Assuming that NOT every character is a replacement char, this will actually run slightly faster than straight string replacements (again, about 20%).

    That being said, remember what we are actually talking about when considering performance cost: in this case, the difference between the optimized solution and the original solution is about 1 second over 100,000 iterations on a 1,000-character string.
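
    For anyone who wants to reproduce that kind of measurement, here is a rough timing harness. It is only a sketch: the names TimingSketch, Measure and ChainedReplace are made up for illustration, the chained Replace baseline is an assumption about what "straight string replace" means, and the test input is arbitrary. Numbers will vary by machine and runtime.

    ```csharp
    using System;
    using System.Diagnostics;
    using System.Linq;

    public static class TimingSketch
    {
        // Runs `transform` on `input` `iterations` times and returns the elapsed wall-clock time.
        public static TimeSpan Measure(Func<string, string> transform, string input, int iterations)
        {
            transform(input); // warm-up call so JIT compilation is not counted
            var watch = Stopwatch.StartNew();
            for (var i = 0; i < iterations; i++)
            {
                transform(input);
            }
            watch.Stop();
            return watch.Elapsed;
        }

        // Assumed baseline: one string.Replace pass per mapped character.
        public static string ChainedReplace(string s)
        {
            return s.Replace('á', 'a').Replace('é', 'e').Replace('í', 'i').Replace('ü', 'u').Replace('ñ', '&');
        }

        public static void Main()
        {
            // 1,000-character input in which one character in ten is mapped.
            var input = string.Concat(Enumerable.Repeat("abcdefghiá", 100));
            var elapsed = Measure(ChainedReplace, input, 100000);
            Console.WriteLine("Chained Replace: " + elapsed.TotalMilliseconds + " ms");
        }
    }
    ```

    Any other string-to-string method, such as the Translate method above, can be passed to Measure in the same way to compare the two approaches on the same input.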

    Either way, just wanted to add some information to the answers for this question.
