I have a website which allows users to comment on photos. Of course, users leave comments like:
\'OMGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG
var nonRepeatedChars = new string(myString.ToCharArray().Distinct().Where(c => !char.IsWhiteSpace(c)).ToArray());
Regex would be overkill. Try this:
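// Requires: using System.Text;
public static string RemoveRepeatedChars(string input, int maxRepeat)
{
    if (input.Length == 0) return input;
    StringBuilder b = new StringBuilder();
    char lastChar = input[0];
    int repeat = 1; // length of the current run of lastChar
    b.Append(lastChar); // the first character is always kept
    for (int i = 1; i < input.Length; i++)
    {
        if (input[i] == lastChar)
        {
            // Same character again: keep it only while the run is within the limit.
            if (++repeat <= maxRepeat)
                b.Append(input[i]);
        }
        else
        {
            // A different character starts a new run.
            b.Append(input[i]);
            repeat = 1;
            lastChar = input[i];
        }
    }
    return b.ToString();
}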
Distinct() will remove all duplicates; however, it is case-sensitive, so it will not treat "A" and "a" as the same.
Console.WriteLine(new string("Asdfasdf".Distinct().ToArray()));
Outputs "Asdfa"
Edit: awful suggestion, please don't read. I truly deserve my -1 :)
I found something like what you're looking for here on technical nuggets.
There's nothing to do except write a very long regex, because I've never heard of a regex token for repetition...
It's a complete example. I won't paste it here, but I think it will answer your question.
var test = "OMMMMMGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGMMM";
test.Distinct().Select(c => c.ToString()).ToList()
    .ForEach(c =>
    {
        // Collapse every doubled character until no adjacent pair remains.
        while (test.Contains(c + c))
            test = test.Replace(c + c, c);
    });
Do you specifically want to shorten the strings in the code, or would it be enough to simply fail validation and present the form to the user again with a validation error? Something like "Too many repeated characters."
If the latter is acceptable, @"(\w)\1{2}" should match any run of three or more identical word characters (that is, a character "repeated" two or more times).
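As a rough illustration of the validation approach, here is a minimal sketch that only tests for a match and leaves the text untouched. The class name `CommentValidator` and the method `HasRepeatedRun` are made up for this example; only `Regex.IsMatch` and the pattern come from the answer.

```csharp
using System.Text.RegularExpressions;

public static class CommentValidator
{
    // Pattern from the answer: one word character followed by two more copies of itself.
    private static readonly Regex RepeatedRun = new Regex(@"(\w)\1{2}");

    // Hypothetical helper: true when the comment contains three or more
    // identical word characters in a row and should fail validation.
    public static bool HasRepeatedRun(string comment)
    {
        return RepeatedRun.IsMatch(comment);
    }
}
```

With this sketch, "OMGGGG" fails validation and "good book" passes. Note that a comment containing "www.example.com" would also be rejected, so the threshold may need to be higher in practice.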
Edit: As @Piskvor pointed out, this will match on exactly 3 characters. It works fine for matching, but not for replacing. His version, @"(\w)\1{2,}", would work better for replacing. However, I'd like to point out that I don't think replacing is the best practice here. It is better to have the form fail validation than to try to scrub the submitted text, because there will likely be edge cases where you turn otherwise readable (even if unreasonable) text into nonsense.
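For anyone who does want to replace rather than validate, a minimal sketch using the `{2,}` pattern with `Regex.Replace` might look like the following. The class name `CommentScrubber`, the method `CollapseRuns`, and the choice to keep two characters of each run are assumptions made for illustration.

```csharp
using System.Text.RegularExpressions;

public static class CommentScrubber
{
    // Hypothetical helper: collapses any run of three or more identical
    // word characters down to two copies of that character.
    public static string CollapseRuns(string comment)
    {
        // (\w) captures the character; \1{2,} consumes the rest of the run.
        return Regex.Replace(comment, @"(\w)\1{2,}", "$1$1");
    }
}
```

This turns "OMGGGGGG" into "OMGG", but it also turns "1000000" into "100" and "www.example.com" into "ww.example.com", which is exactly the kind of edge case mentioned above.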