How do I strip non-alphanumeric characters (including spaces) from a string?

借酒劲吻你 2020-12-14 00:10

How do I strip non-alphanumeric characters from a string, and lose the spaces too, in C# with Replace?

I want to keep a-z, A-Z, 0-9 and nothing more (not even " " spaces).

8 Answers
  • 2020-12-14 00:34

    In your regex, you have excluded the spaces from being matched (and you haven't used Regex.Replace() which I had overlooked completely...):

    result = Regex.Replace("Hello there(hello#)", @"[^A-Za-z0-9]+", "");
    

    should work. The + makes the regex a bit more efficient by matching more than one consecutive non-alphanumeric character at once instead of one by one.

    If you want to keep non-ASCII letters/digits, too, use the following regex:

    @"[^\p{L}\p{N}]+"
    

    which leaves

    BonjourmesélèvesGutenMorgenliebeSchüler
    

    instead of

    BonjourmeslvesGutenMorgenliebeSchler
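
    To see the two patterns side by side, here is a minimal sketch. The input sentence is an assumption chosen to reproduce the outputs quoted above, and the class and method names are illustrative only. It also assumes the accented letters are stored as single precomposed characters.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class AlnumRegexDemo
    {
        // Keeps only ASCII letters and digits; everything else is removed.
        public static string StripAscii(string s) =>
            Regex.Replace(s, @"[^A-Za-z0-9]+", "");

        // Keeps any Unicode letter (\p{L}) or number (\p{N}).
        public static string StripUnicode(string s) =>
            Regex.Replace(s, @"[^\p{L}\p{N}]+", "");

        public static void Main()
        {
            // Assumed sample input, not taken from the original question.
            string input = "Bonjour, mes élèves. GutenMorgen, liebe Schüler!";

            Console.WriteLine(StripAscii(input));   // BonjourmeslvesGutenMorgenliebeSchler
            Console.WriteLine(StripUnicode(input)); // BonjourmesélèvesGutenMorgenliebeSchüler
        }
    }
    ```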
    
  • 2020-12-14 00:37

    In .NET 4.0 the String class has an IsNullOrWhiteSpace method, but it only tests whether a string is null, empty, or made up entirely of white-space characters; it does not remove anything. See http://msdn.microsoft.com/en-us/library/system.string.isnullorwhitespace.aspx. To actually drop white space you would filter the characters yourself, for example with char.IsWhiteSpace. However, as @CodeInChaos pointed out, there are plenty of characters which could be considered letters and numbers. Use a regular expression if you only want to keep A-Za-z0-9.
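
    As a rough sketch of filtering without a regex: string.IsNullOrWhiteSpace only reports whether a string is blank, whereas char.IsLetterOrDigit can serve as a per-character filter. It accepts Unicode letters and digits, so an ASCII-only check is shown next to it. The class and helper names below are made up for illustration.

    ```csharp
    using System;
    using System.Linq;

    public static class CharFilterDemo
    {
        // Hypothetical helper: true only for ASCII letters and digits.
        public static bool IsAsciiAlnum(char c) =>
            (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9');

        // Keeps every character Unicode classifies as a letter or decimal digit.
        public static string KeepUnicodeAlnum(string s) =>
            new string(s.Where(c => char.IsLetterOrDigit(c)).ToArray());

        // Keeps a-z, A-Z and 0-9 only.
        public static string KeepAsciiAlnum(string s) =>
            new string(s.Where(IsAsciiAlnum).ToArray());

        public static void Main()
        {
            // A test only: nothing is removed from the string.
            Console.WriteLine(string.IsNullOrWhiteSpace("   ")); // True

            string input = "text LaLa (lol) á ñ $ 123 ٠١٢٣٤";
            Console.WriteLine(KeepUnicodeAlnum(input)); // textLaLaloláñ123٠١٢٣٤
            Console.WriteLine(KeepAsciiAlnum(input));   // textLaLalol123
        }
    }
    ```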

  • 2020-12-14 00:40

    Or you can do this too:

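    // Requires: using System.Text;
    public static string RemoveNonAlphanumeric(string text)
    {
        // The result can never be longer than the input, so reserve that much up front.
        StringBuilder sb = new StringBuilder(text.Length);

        for (int i = 0; i < text.Length; i++)
        {
            char c = text[i];

            // Keep ASCII letters and digits only; everything else is skipped.
            if ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9'))
                sb.Append(c);
        }

        return sb.ToString();
    }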
    

    Usage:

    string text = SomeClass.RemoveNonAlphanumeric("text LaLa (lol) á ñ $ 123 ٠١٢٣٤");
    
    //text: textLaLalol123
    
  • 2020-12-14 00:41

    The same idea as a replace operation, written as an extension method:

    public static class StringExtensions
    {
        public static string ReplaceNonAlphanumeric(this string text, char replaceChar)
        {
            StringBuilder result = new StringBuilder(text.Length);
    
            foreach(char c in text)
            {
                if(c >= 'a' && c <= 'z' || c >= 'A' && c <= 'Z' || c >= '0' && c <= '9')
                    result.Append(c);
                else
                    result.Append(replaceChar);
            }
    
            return result.ToString();
        } 
    }
    

    And test:

    [TestFixture]
    public sealed class StringExtensionsTests
    {
        [Test]
        public void Test()
        {
            Assert.AreEqual("text_LaLa__lol________123______", "text LaLa (lol) á ñ $ 123 ٠١٢٣٤".ReplaceNonAlphanumeric('_'));
        }
    }
    
  • 2020-12-14 00:44
    var text = "Hello there(hello#)";
    
    var rgx = new Regex("[^a-zA-Z0-9]");
    
    text = rgx.Replace(text, string.Empty);
    
  • 2020-12-14 00:52

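    Use the following regex with Regex.Replace to strip every character except ASCII letters, digits, and white space (the \s keeps white space, so drop it from the class if spaces should be removed too, as the question asks):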

    ([^A-Za-z0-9\s])
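
    A short sketch of what the \s changes, reusing the sample string from the other answers. The class and method names are illustrative assumptions.

    ```csharp
    using System;
    using System.Text.RegularExpressions;

    public static class WhitespaceClassDemo
    {
        // \s inside the negated class means white space is NOT matched, so it survives.
        public static string StripKeepSpaces(string s) =>
            Regex.Replace(s, @"[^A-Za-z0-9\s]", "");

        // Without \s, white space is matched and removed like any other character.
        public static string StripAll(string s) =>
            Regex.Replace(s, @"[^A-Za-z0-9]", "");

        public static void Main()
        {
            string input = "Hello there(hello#)";
            Console.WriteLine(StripKeepSpaces(input)); // Hello therehello
            Console.WriteLine(StripAll(input));        // Hellotherehello
        }
    }
    ```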
    