What's the most efficient way to determine whether an untrimmed string is empty in C#?

I have a string that may have whitespace characters around it and I want to check to see whether it is essentially empty.

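There are quite a few ways to do this:

    if (myString.Trim().Length == 0)
    if (myString.Trim() == "")
    if (myString.Trim().Equals(""))
    if (myString.Trim() == String.Empty)
    if (myString.Trim().Equals(String.Empty))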

8 Answers
  • 2020-12-16 00:06

    I really don't know which is faster, although my gut feeling says number one. But here's another method:

    if (String.IsNullOrEmpty(myString.Trim()))
    
  • 2020-12-16 00:08
    public static bool IsNullOrEmpty(this String str, bool checkTrimmed)
    {
      // Null or empty short-circuits the ||, so Trim() is never called on null
      var b = String.IsNullOrEmpty(str);
      return checkTrimmed ? b || str.Trim().Length == 0 : b;
    }
    
  • 2020-12-16 00:09

    (EDIT: See bottom of post for benchmarks on different micro-optimizations of the method)

    Don't trim it - that might create a new string which you don't actually need. Instead, look through the string for any characters that aren't whitespace (for whatever definition you want). For example:

    public static bool IsEmptyOrWhitespace(string text)
    {
        // Avoid creating iterator for trivial case
        if (text.Length == 0)
        {
            return true;
        }
        foreach (char c in text)
        {
            // Could use Char.IsWhiteSpace(c) instead
            if (c==' ' || c=='\t' || c=='\r' || c=='\n')
            {
                continue;
            }
            return false;
        }
        return true;
    }
    

    You might also consider what you want the method to do if text is null.
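
    One way to settle the null question, sketched here as an assumption rather than as part of the method above: treat null the same as empty, and use Char.IsWhiteSpace for the character test. The names WhitespaceChecks and IsNullOrWhitespaceOnly are made up for illustration. On .NET Framework 4 and later, the built-in string.IsNullOrWhiteSpace performs this kind of check.

```csharp
using System;

public static class WhitespaceChecks
{
    // Hypothetical helper: null counts as "empty" here, which is a design choice
    public static bool IsNullOrWhitespaceOnly(string text)
    {
        if (text == null)
        {
            return true;
        }
        for (int i = 0; i < text.Length; i++)
        {
            // Char.IsWhiteSpace covers Unicode whitespace, not just ' ', \t, \r, \n
            if (!char.IsWhiteSpace(text[i]))
            {
                return false;
            }
        }
        return true;
    }
}
```

    If null should be an error instead, throwing ArgumentNullException at the top would be the usual alternative.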

    Possible further micro-optimizations to experiment with:

    • Is foreach faster or slower than using a for loop like the one below? Note that with the for loop you can remove the "if (text.Length==0)" test at the start.

      for (int i = 0; i < text.Length; i++)
      {
          char c = text[i];
          // ... same whitespace test as above ...
      }
      
    • Same as above, but hoisting the Length call. Note that this isn't good for normal arrays, but might be useful for strings. I haven't tested it.

      int length = text.Length;
      for (int i = 0; i < length; i++)
      {
          char c = text[i];
          // ... same whitespace test as above ...
      }
      
    • In the body of the loop, is there any difference (in speed) between what we've got and:

      if (c != ' ' && c != '\t' && c != '\r' && c != '\n')
      {
          return false;
      }
      
    • Would a switch/case be faster?

      switch (c)
      {
          case ' ': case '\r': case '\n': case '\t':
              // Whitespace: keep scanning
              continue;
          default:
              return false;
      }
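
    Put together, a for loop with a switch in its body could look like the following. This is only a sketch: the class and method names are illustrative, and it has not been benchmarked against the other variants.

```csharp
public static class SwitchWhitespaceCheck
{
    public static bool IsEmptyOrWhitespaceSwitch(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            switch (text[i])
            {
                case ' ':
                case '\t':
                case '\r':
                case '\n':
                    // Whitespace: move on to the next character
                    continue;
                default:
                    // Anything else means the string has real content
                    return false;
            }
        }
        // Either the string was empty or every character was whitespace
        return true;
    }
}
```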
      

    Update on Trim behaviour

    I've just been looking into how Trim can be as efficient as this. It seems that Trim will only create a new string if it needs to. If it can return this or "" it will:

    using System;
    
    class Test
    {
        static void Main()
        {
            CheckTrim(string.Copy(""));
            CheckTrim("  ");
            CheckTrim(" x ");
            CheckTrim("xx");
        }
    
        static void CheckTrim(string text)
        {
            string trimmed = text.Trim();
            Console.WriteLine ("Text: '{0}'", text);
            Console.WriteLine ("Trimmed ref == text? {0}",
                              object.ReferenceEquals(text, trimmed));
            Console.WriteLine ("Trimmed ref == \"\"? {0}",
                              object.ReferenceEquals("", trimmed));
            Console.WriteLine();
        }
    }
    

    This means it's really important that any benchmarks in this question should use a mixture of data:

    • Empty string
    • Whitespace
    • Whitespace surrounding text
    • Text without whitespace

    Of course, the "real world" balance between these four is impossible to predict...

    Benchmarks

    I've run some benchmarks of the original suggestions vs mine, and mine appears to win in everything I throw at it, which surprises me given the results in other answers. However, I've also benchmarked the difference between foreach, for using text.Length, for using text.Length once and then reversing the iteration order, and for with a hoisted length.

    Basically the for loop is very slightly faster, but hoisting the length check makes it slower than foreach. Reversing the for loop direction is very slightly slower than foreach too. I strongly suspect that the JIT is doing interesting things here, in terms of removing duplicate bounds checks etc.

    Code: (see my benchmarking blog entry for the framework this is written against)

    using System;
    using BenchmarkHelper;
    
    public class TrimStrings
    {
        static void Main()
        {
            Test("");
            Test(" ");
            Test(" x ");
            Test("x");
            Test(new string('x', 1000));
            Test(" " + new string('x', 1000) + " ");
            Test(new string(' ', 1000));
        }
    
        static void Test(string text)
        {
            bool expectedResult = text.Trim().Length == 0;
            string title = string.Format("Length={0}, result={1}", text.Length, 
                                         expectedResult);
    
            var results = TestSuite.Create(title, text, expectedResult)
    /*            .Add(x => x.Trim().Length == 0, "Trim().Length == 0")
                .Add(x => x.Trim() == "", "Trim() == \"\"")
                .Add(x => x.Trim().Equals(""), "Trim().Equals(\"\")")
                .Add(x => x.Trim() == string.Empty, "Trim() == string.Empty")
                .Add(x => x.Trim().Equals(string.Empty), "Trim().Equals(string.Empty)")
    */
                .Add(OriginalIsEmptyOrWhitespace)
                .Add(IsEmptyOrWhitespaceForLoop)
                .Add(IsEmptyOrWhitespaceForLoopReversed)
                .Add(IsEmptyOrWhitespaceForLoopHoistedLength)
                .RunTests()                          
                .ScaleByBest(ScalingMode.VaryDuration);
    
            results.Display(ResultColumns.NameAndDuration | ResultColumns.Score,
                            results.FindBest());
        }
    
        public static bool OriginalIsEmptyOrWhitespace(string text)
        {
            if (text.Length == 0)
            {
                return true;
            }
            foreach (char c in text)
            {
                if (c==' ' || c=='\t' || c=='\r' || c=='\n')
                {
                    continue;
                }
                return false;
            }
            return true;
        }
    
        public static bool IsEmptyOrWhitespaceForLoop(string text)
        {
            for (int i=0; i < text.Length; i++)
            {
                char c = text[i];
                if (c==' ' || c=='\t' || c=='\r' || c=='\n')
                {
                    continue;
                }
                return false;
            }
            return true;
        }
    
        public static bool IsEmptyOrWhitespaceForLoopReversed(string text)
        {
            for (int i=text.Length-1; i >= 0; i--)
            {
                char c = text[i];
                if (c==' ' || c=='\t' || c=='\r' || c=='\n')
                {
                    continue;
                }
                return false;
            }
            return true;
        }
    
        public static bool IsEmptyOrWhitespaceForLoopHoistedLength(string text)
        {
            int length = text.Length;
            for (int i=0; i < length; i++)
            {
                char c = text[i];
                if (c==' ' || c=='\t' || c=='\r' || c=='\n')
                {
                    continue;
                }
                return false;
            }
            return true;
        }
    }
    

    Results:

    ============ Length=0, result=True ============
    OriginalIsEmptyOrWhitespace             30.012 1.00
    IsEmptyOrWhitespaceForLoop              30.802 1.03
    IsEmptyOrWhitespaceForLoopReversed      32.944 1.10
    IsEmptyOrWhitespaceForLoopHoistedLength 35.113 1.17
    
    ============ Length=1, result=True ============
    OriginalIsEmptyOrWhitespace             31.150 1.04
    IsEmptyOrWhitespaceForLoop              30.051 1.00
    IsEmptyOrWhitespaceForLoopReversed      31.602 1.05
    IsEmptyOrWhitespaceForLoopHoistedLength 33.383 1.11
    
    ============ Length=3, result=False ============
    OriginalIsEmptyOrWhitespace             30.221 1.00
    IsEmptyOrWhitespaceForLoop              30.131 1.00
    IsEmptyOrWhitespaceForLoopReversed      34.502 1.15
    IsEmptyOrWhitespaceForLoopHoistedLength 35.690 1.18
    
    ============ Length=1, result=False ============
    OriginalIsEmptyOrWhitespace             31.626 1.05
    IsEmptyOrWhitespaceForLoop              30.005 1.00
    IsEmptyOrWhitespaceForLoopReversed      32.383 1.08
    IsEmptyOrWhitespaceForLoopHoistedLength 33.666 1.12
    
    ============ Length=1000, result=False ============
    OriginalIsEmptyOrWhitespace             30.177 1.00
    IsEmptyOrWhitespaceForLoop              33.207 1.10
    IsEmptyOrWhitespaceForLoopReversed      30.867 1.02
    IsEmptyOrWhitespaceForLoopHoistedLength 31.837 1.06
    
    ============ Length=1002, result=False ============
    OriginalIsEmptyOrWhitespace             30.217 1.01
    IsEmptyOrWhitespaceForLoop              30.026 1.00
    IsEmptyOrWhitespaceForLoopReversed      34.162 1.14
    IsEmptyOrWhitespaceForLoopHoistedLength 34.860 1.16
    
    ============ Length=1000, result=True ============
    OriginalIsEmptyOrWhitespace             30.303 1.01
    IsEmptyOrWhitespaceForLoop              30.018 1.00
    IsEmptyOrWhitespaceForLoopReversed      35.475 1.18
    IsEmptyOrWhitespaceForLoopHoistedLength 40.927 1.36
    
  • 2020-12-16 00:10

    myString.Trim().Length == 0 Took : 421 ms

    myString.Trim() == "" Took : 468 ms

    if (myString.Trim().Equals("")) Took : 515 ms

    if (myString.Trim() == String.Empty) Took : 484 ms

    if (myString.Trim().Equals(String.Empty)) Took : 500 ms

    if (string.IsNullOrEmpty(myString.Trim())) Took : 437 ms

    In my tests, myString.Trim().Length == 0 and, surprisingly, string.IsNullOrEmpty(myString.Trim()) were consistently the fastest. The results above are typical for a run of 10,000,000 comparisons.
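
    The timing code itself isn't shown here. A loop of this kind could be timed roughly as follows; this is a sketch, not the harness that produced the numbers above. TrimTiming, Time, Run and the sample input are all assumptions for illustration.

```csharp
using System;
using System.Diagnostics;

public static class TrimTiming
{
    // Runs the check `iterations` times and returns the elapsed milliseconds
    public static long Time(Func<string, bool> check, string input, int iterations)
    {
        bool sink = false;
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            // Fold each result into a sink so the call's value is actually used
            sink ^= check(input);
        }
        watch.Stop();
        GC.KeepAlive(sink);
        return watch.ElapsedMilliseconds;
    }

    public static void Run()
    {
        const int Iterations = 10000000;
        string myString = "   some text   "; // assumed sample input
        Console.WriteLine("Trim().Length == 0: {0} ms",
            Time(s => s.Trim().Length == 0, myString, Iterations));
        Console.WriteLine("IsNullOrEmpty(Trim()): {0} ms",
            Time(s => string.IsNullOrEmpty(s.Trim()), myString, Iterations));
    }
}
```

    Numbers from a loop like this depend heavily on the input string, the runtime and whether the build is optimized, so they are only comparable within a single run.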

  • 2020-12-16 00:10

    Since I just started I can't comment so here it is.

    if (String.IsNullOrEmpty(myString.Trim()))
    

    The Trim() call will throw a NullReferenceException if myString is null, since you can't call methods on a null reference.

    So a safer version would be something like this:

    if (!String.IsNullOrEmpty(myString))
    {
        string trimmedString = myString.Trim();
        // do the rest of your code
    }
    else
    {
        //string is null or empty, don't bother processing it
    }
    
  • 2020-12-16 00:17

    Checking the length of a string for being zero is the most efficient way to test for an empty string, so I would say number 1:

    if (myString.Trim().Length == 0)
    

    The only way to optimize this further might be to avoid trimming by using a compiled regular expression (Edit: this is actually much slower than using Trim().Length).

    Edit: The suggestion to use Length came from an FxCop guideline. I've also just tested it: it's 2-3 times faster than comparing to an empty string. However, both approaches are still extremely fast (we're talking nanoseconds), so it hardly matters which one you use. Trimming is far more of a bottleneck: it's hundreds of times slower than the actual comparison at the end.
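
    For reference, the compiled regular expression idea could be sketched like this. The pattern and the names are assumptions, since the expression that was actually tested isn't shown, and as noted above this route turned out to be much slower than Trim().Length.

```csharp
using System.Text.RegularExpressions;

public static class RegexWhitespaceCheck
{
    // \A and \z anchor to the absolute start and end of the input;
    // \s matches any whitespace character, so this is broader than ' ', \t, \r, \n
    private static readonly Regex WhitespaceOnly =
        new Regex(@"\A\s*\z", RegexOptions.Compiled);

    // Throws ArgumentNullException if text is null, as Regex.IsMatch does
    public static bool IsEmptyOrWhitespace(string text)
    {
        return WhitespaceOnly.IsMatch(text);
    }
}
```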
