I have a string that may have whitespace characters around it and I want to check to see whether it is essentially empty.
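There are quite a few ways to do this:
1. if (myString.Trim().Length == 0)
2. if (myString.Trim() == "")
3. if (myString.Trim().Equals(""))
4. if (myString.Trim() == String.Empty)
5. if (myString.Trim().Equals(String.Empty))
I really don't know which is faster, although my gut feeling says number one. But here's another method: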
if (String.IsNullOrEmpty(myString.Trim()))
public static bool IsNullOrEmpty(this String str, bool checkTrimmed)
{
    // Null or empty always counts as empty; Trim() is only reached for a non-null string.
    var b = String.IsNullOrEmpty(str);
    return checkTrimmed ? b || str.Trim().Length == 0 : b;
}
(EDIT: See bottom of post for benchmarks on different micro-optimizations of the method)
Don't trim it - that might create a new string which you don't actually need. Instead, look through the string for any characters that aren't whitespace (for whatever definition you want). For example:
public static bool IsEmptyOrWhitespace(string text)
{
    // Avoid creating iterator for trivial case
    if (text.Length == 0)
    {
        return true;
    }
    foreach (char c in text)
    {
        // Could use Char.IsWhiteSpace(c) instead
        if (c==' ' || c=='\t' || c=='\r' || c=='\n')
        {
            continue;
        }
        return false;
    }
    return true;
}
You might also consider what you want the method to do if text is null.
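If null should count as empty, one option is a guard at the top of the method. The following is a minimal sketch rather than a definitive implementation: the class and method names are made up for illustration, and it uses Char.IsWhiteSpace, so it accepts any Unicode whitespace rather than only the four characters checked above.

```csharp
using System;

// Hypothetical container class; the name is an assumption for this sketch.
public static class WhitespaceChecks
{
    // Returns true for null, for "", and for strings made up only of whitespace.
    public static bool IsNullOrWhitespace(string text)
    {
        if (text == null)
        {
            return true;
        }
        for (int i = 0; i < text.Length; i++)
        {
            // Any non-whitespace character means the string is not "essentially empty".
            if (!char.IsWhiteSpace(text[i]))
            {
                return false;
            }
        }
        return true;
    }
}
```

On .NET Framework 4.0 and later, the built-in string.IsNullOrWhiteSpace performs this kind of check, so a hand-written version is mainly useful on older frameworks or when you want your own definition of whitespace.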
Possible further micro-optimizations to experiment with:
Is foreach faster or slower than using a for loop like the one below? Note that with the for loop you can remove the "if (text.Length==0)" test at the start.
for (int i = 0; i < text.Length; i++)
{
char c = text[i];
// ...
Same as above, but hoisting the Length call. Note that this isn't good for normal arrays, but might be useful for strings. I haven't tested it.
int length = text.Length;
for (int i = 0; i < length; i++)
{
char c = text[i];
In the body of the loop, is there any difference (in speed) between what we've got and:
if (c != ' ' && c != '\t' && c != '\r' && c != '\n')
{
return false;
}
Would a switch/case be faster?
switch (c)
{
    case ' ': case '\r': case '\n': case '\t':
        continue;
    default:
        return false;
}
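To show how the switch idea might slot into a whole method, here is a sketch under the same assumptions as the loop versions above (non-null input, and only space, tab, CR and LF counted as whitespace). The class and method names are hypothetical. For the method to agree with the other variants, the whitespace cases have to move on to the next character and anything else has to return false.

```csharp
// Hypothetical container class; the name is an assumption for this sketch.
public static class SwitchWhitespaceCheck
{
    public static bool IsEmptyOrWhitespaceSwitch(string text)
    {
        for (int i = 0; i < text.Length; i++)
        {
            switch (text[i])
            {
                case ' ':
                case '\t':
                case '\r':
                case '\n':
                    // Whitespace: 'continue' applies to the enclosing for loop.
                    continue;
                default:
                    // First non-whitespace character decides the answer.
                    return false;
            }
        }
        return true;
    }
}
```

Whether this is any faster than the chained comparisons is something only a benchmark on your own runtime can tell you.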
Update on Trim behaviour
I've just been looking into how Trim can be as efficient as this. It seems that Trim will only create a new string if it needs to. If it can return this or "" it will:
using System;

class Test
{
    static void Main()
    {
        CheckTrim(string.Copy(""));
        CheckTrim(" ");
        CheckTrim(" x ");
        CheckTrim("xx");
    }

    static void CheckTrim(string text)
    {
        string trimmed = text.Trim();
        Console.WriteLine ("Text: '{0}'", text);
        Console.WriteLine ("Trimmed ref == text? {0}",
                           object.ReferenceEquals(text, trimmed));
        Console.WriteLine ("Trimmed ref == \"\"? {0}",
                           object.ReferenceEquals("", trimmed));
        Console.WriteLine();
    }
}
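This means it's really important that any benchmarks in this question should use a mixture of data: empty strings, strings that are all whitespace, text surrounded by whitespace, and text with no whitespace to trim.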
Of course, the "real world" balance between these four is impossible to predict...
Benchmarks
I've run some benchmarks of the original suggestions vs mine, and mine appears to win in everything I throw at it, which surprises me given the results in other answers. However, I've also benchmarked the difference between foreach, for using text.Length, for using text.Length once and then reversing the iteration order, and for with a hoisted length.
Basically the for loop is very slightly faster, but hoisting the length check makes it slower than foreach. Reversing the for loop direction is very slightly slower than foreach too. I strongly suspect that the JIT is doing interesting things here, in terms of removing duplicate bounds checks etc.
Code: (see my benchmarking blog entry for the framework this is written against)
using System;
using BenchmarkHelper;

public class TrimStrings
{
    static void Main()
    {
        Test("");
        Test(" ");
        Test(" x ");
        Test("x");
        Test(new string('x', 1000));
        Test(" " + new string('x', 1000) + " ");
        Test(new string(' ', 1000));
    }

    static void Test(string text)
    {
        bool expectedResult = text.Trim().Length == 0;
        string title = string.Format("Length={0}, result={1}", text.Length,
                                     expectedResult);
        var results = TestSuite.Create(title, text, expectedResult)
/*          .Add(x => x.Trim().Length == 0, "Trim().Length == 0")
            .Add(x => x.Trim() == "", "Trim() == \"\"")
            .Add(x => x.Trim().Equals(""), "Trim().Equals(\"\")")
            .Add(x => x.Trim() == string.Empty, "Trim() == string.Empty")
            .Add(x => x.Trim().Equals(string.Empty), "Trim().Equals(string.Empty)")
*/
            .Add(OriginalIsEmptyOrWhitespace)
            .Add(IsEmptyOrWhitespaceForLoop)
            .Add(IsEmptyOrWhitespaceForLoopReversed)
            .Add(IsEmptyOrWhitespaceForLoopHoistedLength)
            .RunTests()
            .ScaleByBest(ScalingMode.VaryDuration);
        results.Display(ResultColumns.NameAndDuration | ResultColumns.Score,
                        results.FindBest());
    }

    public static bool OriginalIsEmptyOrWhitespace(string text)
    {
        if (text.Length == 0)
        {
            return true;
        }
        foreach (char c in text)
        {
            if (c==' ' || c=='\t' || c=='\r' || c=='\n')
            {
                continue;
            }
            return false;
        }
        return true;
    }

    public static bool IsEmptyOrWhitespaceForLoop(string text)
    {
        for (int i=0; i < text.Length; i++)
        {
            char c = text[i];
            if (c==' ' || c=='\t' || c=='\r' || c=='\n')
            {
                continue;
            }
            return false;
        }
        return true;
    }

    public static bool IsEmptyOrWhitespaceForLoopReversed(string text)
    {
        for (int i=text.Length-1; i >= 0; i--)
        {
            char c = text[i];
            if (c==' ' || c=='\t' || c=='\r' || c=='\n')
            {
                continue;
            }
            return false;
        }
        return true;
    }

    public static bool IsEmptyOrWhitespaceForLoopHoistedLength(string text)
    {
        int length = text.Length;
        for (int i=0; i < length; i++)
        {
            char c = text[i];
            if (c==' ' || c=='\t' || c=='\r' || c=='\n')
            {
                continue;
            }
            return false;
        }
        return true;
    }
}
Results:
============ Length=0, result=True ============
OriginalIsEmptyOrWhitespace 30.012 1.00
IsEmptyOrWhitespaceForLoop 30.802 1.03
IsEmptyOrWhitespaceForLoopReversed 32.944 1.10
IsEmptyOrWhitespaceForLoopHoistedLength 35.113 1.17
============ Length=1, result=True ============
OriginalIsEmptyOrWhitespace 31.150 1.04
IsEmptyOrWhitespaceForLoop 30.051 1.00
IsEmptyOrWhitespaceForLoopReversed 31.602 1.05
IsEmptyOrWhitespaceForLoopHoistedLength 33.383 1.11
============ Length=3, result=False ============
OriginalIsEmptyOrWhitespace 30.221 1.00
IsEmptyOrWhitespaceForLoop 30.131 1.00
IsEmptyOrWhitespaceForLoopReversed 34.502 1.15
IsEmptyOrWhitespaceForLoopHoistedLength 35.690 1.18
============ Length=1, result=False ============
OriginalIsEmptyOrWhitespace 31.626 1.05
IsEmptyOrWhitespaceForLoop 30.005 1.00
IsEmptyOrWhitespaceForLoopReversed 32.383 1.08
IsEmptyOrWhitespaceForLoopHoistedLength 33.666 1.12
============ Length=1000, result=False ============
OriginalIsEmptyOrWhitespace 30.177 1.00
IsEmptyOrWhitespaceForLoop 33.207 1.10
IsEmptyOrWhitespaceForLoopReversed 30.867 1.02
IsEmptyOrWhitespaceForLoopHoistedLength 31.837 1.06
============ Length=1002, result=False ============
OriginalIsEmptyOrWhitespace 30.217 1.01
IsEmptyOrWhitespaceForLoop 30.026 1.00
IsEmptyOrWhitespaceForLoopReversed 34.162 1.14
IsEmptyOrWhitespaceForLoopHoistedLength 34.860 1.16
============ Length=1000, result=True ============
OriginalIsEmptyOrWhitespace 30.303 1.01
IsEmptyOrWhitespaceForLoop 30.018 1.00
IsEmptyOrWhitespaceForLoopReversed 35.475 1.18
IsEmptyOrWhitespaceForLoopHoistedLength 40.927 1.36
myString.Trim().Length == 0 took: 421 ms
myString.Trim() == "" took: 468 ms
myString.Trim().Equals("") took: 515 ms
myString.Trim() == String.Empty took: 484 ms
myString.Trim().Equals(String.Empty) took: 500 ms
string.IsNullOrEmpty(myString.Trim()) took: 437 ms
In my tests, it looks like myString.Trim().Length == 0 and, surprisingly, string.IsNullOrEmpty(myString.Trim()) were consistently the fastest. The results above are typical of a run of 10,000,000 comparisons.
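For anyone wanting to reproduce timings like these, a rough harness might look like the sketch below. It is an assumption about how such a test could be written, not the code that produced the numbers above; the class name, method name and parameters are all hypothetical. It counts the true results so the calls cannot be discarded as unused work.

```csharp
using System;
using System.Diagnostics;

// Hypothetical timing helper; not the harness used for the figures above.
public static class TrimTiming
{
    // Runs 'check' against 'input' the given number of times.
    // Returns elapsed milliseconds; 'trueCount' reports how many calls returned true.
    public static long Time(Func<string, bool> check, string input,
                            int iterations, out int trueCount)
    {
        int count = 0;
        Stopwatch stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            if (check(input))
            {
                count++;
            }
        }
        stopwatch.Stop();
        trueCount = count;
        return stopwatch.ElapsedMilliseconds;
    }
}
```

A call such as TrimTiming.Time(s => s.Trim().Length == 0, myString, 10000000, out hits) would then time one variant. The delegate call adds its own overhead to every iteration, so the absolute numbers are only useful for comparing variants measured the same way, ideally after a warm-up run.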
Since I just started I can't comment so here it is.
if (String.IsNullOrEmpty(myString.Trim()))
The Trim() call will fail if myString is null, since you can't call methods on an object that is null (NullReferenceException).
So the correct syntax would be something like this:
if (!String.IsNullOrEmpty(myString))
{
    string trimmedString = myString.Trim();
    // do the rest of your code
}
else
{
    // string is null or empty, don't bother processing it
}
Checking the length of a string for being zero is the most efficient way to test for an empty string, so I would say number 1:
if (myString.Trim().Length == 0)
The only way to optimize this further might be to avoid trimming by using a compiled regular expression (Edit: this is actually much slower than using Trim().Length).
Edit: The suggestion to use Length came from an FxCop guideline. I've also just tested it: it's 2-3 times faster than comparing to an empty string. However, both approaches are still extremely fast (we're talking nanoseconds), so it hardly matters which one you use. Trimming is much more of a bottleneck: it's hundreds of times slower than the actual comparison at the end.
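For reference, the compiled regular expression approach mentioned above might look something like this. It is a sketch only, with hypothetical class and field names, and as noted it is not a performance win. Also be aware that \s follows the regex engine's definition of whitespace, which is not guaranteed to match exactly what Trim() removes.

```csharp
using System.Text.RegularExpressions;

// Hypothetical container class; the names are assumptions for this sketch.
public static class RegexWhitespaceCheck
{
    // \A and \z anchor to the very start and end of the input;
    // \s* matches any run of whitespace, including none at all.
    private static readonly Regex AllWhitespace =
        new Regex(@"\A\s*\z", RegexOptions.Compiled);

    // Returns true for "" and for strings containing only whitespace.
    // Like Trim(), this throws if text is null.
    public static bool IsEmptyOrWhitespace(string text)
    {
        return AllWhitespace.IsMatch(text);
    }
}
```

Keeping the Regex in a static readonly field means it is constructed and compiled once rather than on every call.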