Why isn't it possible to use fluent language on string?
For example:
var x = "asdf1234";
var y = new string(x.TakeWhile(char.IsLetter).ToArray());
Assuming that you're looking predominantly for performance, then something like this should be substantially faster than any of your examples:
string x = "asdf1234";
string y = x.LeadingLettersOnly();
// ...
public static class StringExtensions
{
    public static string LeadingLettersOnly(this string source)
    {
        if (source == null)
            throw new ArgumentNullException("source");
        if (source.Length == 0)
            return source;
        char[] buffer = new char[source.Length];
        int bufferIndex = 0;
        for (int sourceIndex = 0; sourceIndex < source.Length; sourceIndex++)
        {
            char c = source[sourceIndex];
            if (!char.IsLetter(c))
                break;
            buffer[bufferIndex++] = c;
        }
        return new string(buffer, 0, bufferIndex);
    }
}
Edited for the release of .NET Core 2.1
Repeating the test for the release of .NET Core 2.1, I get results like this:
1000000 iterations of "Concat" took 842ms.
1000000 iterations of "new String" took 1009ms.
1000000 iterations of "sb" took 902ms.
In short, if you are using .NET Core 2.1 or later, Concat is king.
See MS blog post for more details.
I've made this the subject of another question, but more and more it is becoming a direct answer to this question.
I've done some performance testing of three simple methods of converting an IEnumerable<char> to a string. Those methods are:
new string
return new string(charSequence.ToArray());
Concat
return string.Concat(charSequence);
StringBuilder
var sb = new StringBuilder();
foreach (var c in charSequence)
{
    sb.Append(c);
}
return sb.ToString();
In my testing, which is detailed in the linked question, for 1000000 iterations of "Some reasonably small test data" I get results like this:
1000000 iterations of "Concat" took 1597ms.
1000000 iterations of "new string" took 869ms.
1000000 iterations of "StringBuilder" took 748ms.
This suggests to me that there is no good reason to use string.Concat for this task. If you want simplicity, use the new string approach, and if you want performance, use the StringBuilder.
One caveat to my assertion: in practice all these methods work fine, and this could all be over-optimization.
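The harness behind those numbers lives in the linked question and is not reproduced here. As a rough sketch only, a comparison like this could be timed with Stopwatch along the following lines. The class name, the helper names (Time, Run) and the single warm-up call are my assumptions, not the original test code, and timings will differ by runtime and machine.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;

// Hypothetical harness: names and structure are assumptions for illustration.
public static class CharSequenceBenchmark
{
    public static string ViaNewString(IEnumerable<char> charSequence) =>
        new string(charSequence.ToArray());

    public static string ViaConcat(IEnumerable<char> charSequence) =>
        string.Concat(charSequence);

    public static string ViaStringBuilder(IEnumerable<char> charSequence)
    {
        var sb = new StringBuilder();
        foreach (var c in charSequence)
        {
            sb.Append(c);
        }
        return sb.ToString();
    }

    // Times `iterations` calls of `convert` over the same input.
    public static TimeSpan Time(
        Func<IEnumerable<char>, string> convert,
        IEnumerable<char> data,
        int iterations)
    {
        convert(data); // one warm-up call so JIT compilation is not measured
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            convert(data);
        }
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    // Prints one line per method in the same shape as the results above.
    public static void Run(int iterations)
    {
        IEnumerable<char> data = "Some reasonably small test data";
        Report("Concat", ViaConcat, data, iterations);
        Report("new string", ViaNewString, data, iterations);
        Report("StringBuilder", ViaStringBuilder, data, iterations);
    }

    private static void Report(
        string name,
        Func<IEnumerable<char>, string> convert,
        IEnumerable<char> data,
        int iterations)
    {
        TimeSpan elapsed = Time(convert, data, iterations);
        Console.WriteLine(
            $"{iterations} iterations of \"{name}\" took {elapsed.TotalMilliseconds:F0}ms.");
    }
}
```

Calling CharSequenceBenchmark.Run(1000000) from a console app's Main would print three lines in the format shown above. Run it in a Release build without a debugger attached, otherwise the numbers mean little.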
Why isn't it possible to use fluent language on string?
It is possible. You did it in the question itself:
var y = new string(x.TakeWhile(char.IsLetter).ToArray());
Isn't there a better way to convert IEnumerable<char> to string?
(My assumption is:)
The framework does not have such a constructor because strings are immutable, and you'd have to traverse the enumeration twice in order to pre-allocate the memory for the string. This is not always an option, especially if your input is a stream.
The only solution to this is to push to a backing array or StringBuilder first, and reallocate as the input grows. For something as low-level as a string, this should probably be considered too hidden a mechanism. It would also push perf problems down into the string class by encouraging people to use a mechanism that cannot be as fast as possible.
These problems are solved easily by requiring the user to use the ToArray extension method.
As others have pointed out, you can achieve what you want (perf and expressive code) if you write support code, and wrap that support code in an extension method to get a clean interface.
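To make the "support code behind an extension method" idea concrete, here is a minimal sketch. AsString and CharEnumerableExtensions are hypothetical names of my own, not framework members. It enumerates the input once and lets StringBuilder handle the reallocation described above.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class CharEnumerableExtensions
{
    // Hypothetical helper, not part of the framework.
    public static string AsString(this IEnumerable<char> source)
    {
        if (source == null)
            throw new ArgumentNullException(nameof(source));

        // A string is already an IEnumerable<char>; return it without copying.
        var alreadyString = source as string;
        if (alreadyString != null)
            return alreadyString;

        // Single pass over the input; StringBuilder grows its buffer as needed.
        var sb = new StringBuilder();
        foreach (char c in source)
        {
            sb.Append(c);
        }
        return sb.ToString();
    }
}
```

With using System.Linq in scope, the example from the question would then read x.TakeWhile(char.IsLetter).AsString().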
return new string(foo.Select(x => x).ToArray());
You can very often do better performance-wise. But what does that buy you? Unless this really is the bottleneck for your application, and you have measured it to be, I would stick to the LINQ TakeWhile() version: it is the most readable and maintainable solution, and that is what counts for most applications.
If you really are looking for raw performance you could do the conversion manually - the following was around a factor of 4+ (depending on input string length) faster than TakeWhile() in my tests - but I wouldn't use it personally unless it was critical:
int j = 0;
for (; j < input.Length; j++)
{
    if (!char.IsLetter(input[j]))
        break;
}
string output = input.Substring(0, j);
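Before swapping one version for the other, it may be worth checking that they agree on edge cases such as an empty string or a string with no leading letters. The sketch below wraps both in methods for that purpose. The class and method names are made up for illustration.

```csharp
using System;
using System.Linq;

// Hypothetical wrapper used only to compare the two approaches.
public static class LeadingLettersCheck
{
    // The manual loop from above, wrapped in a method.
    public static string Manual(string input)
    {
        int j = 0;
        for (; j < input.Length; j++)
        {
            if (!char.IsLetter(input[j]))
                break;
        }
        // j is the count of leading letters; Substring(0, 0) yields "".
        return input.Substring(0, j);
    }

    // The LINQ version from the question.
    public static string WithLinq(string input) =>
        new string(input.TakeWhile(char.IsLetter).ToArray());

    // True when both approaches return the same string for this input.
    public static bool Agree(string input) =>
        Manual(input) == WithLinq(input);
}
```

For "asdf1234" and "asdf" both methods return "asdf", and for "" and "1234" both return an empty string.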
How about this to convert IEnumerable<char> to string:
string.Concat(x.TakeWhile(char.IsLetter));