For the hope-to-have-an-answer-in-30-seconds part of this question, I'm specifically looking for C#.
But in the general case, what's the best way to strip punctuation?
Why not simply:
string s = "sxrdct?fvzguh,bij.";
var sb = new StringBuilder();
foreach (char c in s)
{
    if (!char.IsPunctuation(c))
        sb.Append(c);
}
s = sb.ToString();
Regex is usually slower than simple char operations, and those LINQ operations look like overkill to me. You also can't use LINQ code in .NET 2.0...
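If you want to check the speed claim on your own machine rather than take it on faith, here is a rough timing sketch. The class and method names (`StripTiming`, `WithLoop`, `WithRegex`) and the iteration count are made up for illustration, and a `Stopwatch` loop like this only gives a ballpark figure, so treat the numbers as indicative at best.

```csharp
using System;
using System.Diagnostics;
using System.Text;
using System.Text.RegularExpressions;

public static class StripTiming
{
    // The plain char loop from the answer above.
    public static string WithLoop(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (!char.IsPunctuation(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    // Regex equivalent: \p{P} matches any Unicode punctuation character.
    public static string WithRegex(string s)
    {
        return Regex.Replace(s, @"\p{P}", string.Empty);
    }

    public static void Main()
    {
        const string input = "sxrdct?fvzguh,bij.";
        const int iterations = 100000; // arbitrary; adjust to taste

        // Warm up both paths so JIT and regex construction are not timed.
        WithLoop(input);
        WithRegex(input);

        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) WithLoop(input);
        sw.Stop();
        Console.WriteLine("char loop: " + sw.ElapsedMilliseconds + " ms");

        sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++) WithRegex(input);
        sw.Stop();
        Console.WriteLine("regex:     " + sw.ElapsedMilliseconds + " ms");
    }
}
```

Results will vary with runtime version and input length, so measure with strings that look like your real data.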
Describes intent, easiest to read (IMHO) and best performing:
s = s.StripPunctuation();
To implement:
public static class StringExtension
{
    public static string StripPunctuation(this string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (!char.IsPunctuation(c))
                sb.Append(c);
        }
        return sb.ToString();
    }
}
This uses Hades32's algorithm, which was the best performing of the bunch posted.
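One thing worth knowing before relying on this: `char.IsPunctuation` follows the Unicode punctuation categories, so characters classed as symbols (such as `$`, `+` and `=`) are left alone. A small sketch to show the difference; the class name `IsPunctuationDemo` and the sample strings are just for illustration, and the loop is repeated so the snippet compiles on its own.

```csharp
using System;
using System.Text;

public static class IsPunctuationDemo
{
    // Same loop as StripPunctuation above, repeated to keep the sketch self-contained.
    public static string Strip(string s)
    {
        var sb = new StringBuilder(s.Length);
        foreach (char c in s)
        {
            if (!char.IsPunctuation(c))
                sb.Append(c);
        }
        return sb.ToString();
    }

    public static void Main()
    {
        // Question marks, commas and periods are punctuation and get removed.
        Console.WriteLine(Strip("sxrdct?fvzguh,bij."));   // sxrdctfvzguhbij

        // '_', '-', ':' and '!' are punctuation; '$', '+' and '=' are symbols and stay.
        Console.WriteLine(Strip("a_b-c: $5 + $3 = $8!")); // abc $5 + $3 = $8
    }
}
```

If you need symbols gone as well, you could additionally test `char.IsSymbol(c)` in the same `if`.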
You can use the Regex.Replace method:
Regex.Replace(yourString, regularExpressionWithPunctuationMarks, string.Empty)
Since this returns a string, your method will look something like this:
string s = Regex.Replace("Hello!?!?!?!", "[?!]", "");
You can replace "[?!]" with something more sophisticated if you want:
(\p{P})
This should find any punctuation.
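Putting the `\p{P}` pattern to work might look like the sketch below. The class name `RegexStripDemo` and the cached `Punctuation` field are my own choices, not something from the question; caching the `Regex` instance is optional but avoids re-parsing the pattern on every call.

```csharp
using System;
using System.Text.RegularExpressions;

public static class RegexStripDemo
{
    // \p{P} matches any character in one of the Unicode punctuation categories.
    private static readonly Regex Punctuation = new Regex(@"\p{P}");

    public static string Strip(string s)
    {
        // Replace every punctuation character with nothing.
        return Punctuation.Replace(s, string.Empty);
    }

    public static void Main()
    {
        Console.WriteLine(Strip("Hello!?!?!?!"));             // Hello
        Console.WriteLine(Strip("Hello, world... (really)")); // Hello world really
    }
}
```

Note that the verbatim string (`@"..."`) keeps the backslash intact; without it you would need to write `"\\p{P}"`.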
Here's a slightly different approach using LINQ. I like AviewAnew's, but this avoids the Aggregate:
string myStr = "Hello there..';,]';';., Get rid of Punction";
var s = from ch in myStr
        where !Char.IsPunctuation(ch)
        select ch;
// Build the string straight from the chars; a round trip through ASCII bytes
// would turn any non-ASCII character into '?'.
var stringResult = new string(s.ToArray());
This thread is so old, but I'd be remiss not to post a more elegant (IMO) solution.
string inputSansPunc = input.Where(c => !char.IsPunctuation(c)).Aggregate("", (current, c) => current + c);
It's LINQ sans WTF.
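One caveat with the `Aggregate` version: `current + c` allocates a fresh string for every character kept, which adds up on long inputs. A possible variant that keeps the one-liner feel is sketched below, assuming .NET 4 or later for the generic `string.Concat` overload; the class name `LinqStripDemo` is only for illustration.

```csharp
using System;
using System.Linq;

public static class LinqStripDemo
{
    public static string Strip(string input)
    {
        // string.Concat<T>(IEnumerable<T>) joins the filtered chars in one pass,
        // instead of creating a new intermediate string per character.
        return string.Concat(input.Where(c => !char.IsPunctuation(c)));
    }

    public static void Main()
    {
        Console.WriteLine(Strip("Hello there..';,]';';., Get rid of it")); // Hello there Get rid of it
    }
}
```

On older frameworks, `new string(input.Where(c => !char.IsPunctuation(c)).ToArray())` should do the same job.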
The most braindead simple way of doing it would be using string.Replace, once for each punctuation mark you want gone.
The other way I would imagine is Regex.Replace, with a regular expression that has all the appropriate punctuation marks in it.
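For what it's worth, the string.Replace route could be sketched like this. The `Marks` list is a hypothetical, hand-picked set (not a complete list of punctuation), and the class name `ReplaceStripDemo` is made up for the example.

```csharp
using System;

public static class ReplaceStripDemo
{
    // Hypothetical, hand-picked marks; extend with whatever matters for your input.
    private static readonly string[] Marks = { ".", ",", "!", "?", ";", ":" };

    public static string Strip(string s)
    {
        // Each Replace call scans the string and returns a new copy without that mark.
        foreach (string mark in Marks)
            s = s.Replace(mark, string.Empty);
        return s;
    }

    public static void Main()
    {
        Console.WriteLine(Strip("Hello!?!?!?!"));     // Hello
        Console.WriteLine(Strip("Wait; what, now?")); // Wait what now
    }
}
```

This makes one pass over the string per mark, so it is simple rather than fast, and anything not in the list (quotes, brackets and so on) is left untouched.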