How can I strip punctuation from a string?

天命终不由人 2020-12-04 18:47

For the hope-to-have-an-answer-in-30-seconds part of this question, I'm specifically looking for C#.

But in the general case, what's the best way to strip punctuati

15 answers
  • 2020-12-04 19:36
    // ereg_replace() was removed in PHP 7; preg_replace() accepts the same POSIX class.
    $newstr = preg_replace('/[[:punct:]]/', '', $oldstr);
    
  • 2020-12-04 19:39
    new string(myCharCollection.Where(c => !char.IsPunctuation(c)).ToArray());
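
    For anyone who wants to paste and run it, here is a minimal sketch that wraps the one-liner above in a method. The class name `PunctuationStripper` and the method name `StripPunctuation` are made up for illustration; only `char.IsPunctuation` and the LINQ calls come from the answer.

    ```csharp
    using System;
    using System.Linq;

    public static class PunctuationStripper
    {
        // Hypothetical helper: keeps every character that
        // char.IsPunctuation does not flag and rebuilds the string.
        public static string StripPunctuation(string input)
        {
            if (input == null) throw new ArgumentNullException(nameof(input));
            return new string(input.Where(c => !char.IsPunctuation(c)).ToArray());
        }
    }
    ```

    Calling `PunctuationStripper.StripPunctuation("Hello, world! (It's 5 o'clock.)")` gives `Hello world Its 5 oclock`. Note that characters such as `$`, `+` and `=` are classified as symbols, not punctuation, so they are left in place; `char.IsSymbol` would be the check for those.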
    
  • 2020-12-04 19:40

    I faced the same issue and was concerned about the performance impact of calling char.IsPunctuation on every single character.

    I found this post: http://www.dotnetperls.com/char-ispunctuation.

    In short: according to that post, char.IsPunctuation handles Unicode on top of ASCII and matches a large set of characters, control characters included. By design, that makes the method comparatively heavy and expensive.

    The bottom line is that I finally didn't go for it because of its performance impact on my ETL process.

    I went with the custom implementation from dotnetperls.
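
    To give an idea of what an ASCII-only approach can look like, here is a rough sketch. It is not the dotnetperls code; the class name `AsciiPunctuation`, the lookup table, and the exact set of marks are my own assumptions, so adjust the list to your data.

    ```csharp
    using System.Text;

    public static class AsciiPunctuation
    {
        // Assumption: only these ASCII marks count as punctuation.
        // Symbols such as $ + < = > ^ ` | ~ are deliberately left out.
        private const string Marks = "!\"#%&'()*,-./:;?@[\\]_{}";

        // 128-entry table so each check is a single array read.
        private static readonly bool[] IsMark = BuildTable();

        private static bool[] BuildTable()
        {
            var table = new bool[128];
            foreach (char c in Marks)
            {
                table[c] = true;
            }
            return table;
        }

        public static string Strip(string input)
        {
            var sb = new StringBuilder(input.Length);
            foreach (char c in input)
            {
                // Non-ASCII characters are never treated as punctuation here.
                if (c < 128 && IsMark[c]) continue;
                sb.Append(c);
            }
            return sb.ToString();
        }
    }
    ```

    The trade-off is that any punctuation outside ASCII, such as `¿` or typographic quotes, passes through untouched, which may or may not be acceptable for a given ETL input.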

    And just FYI, here is some code derived from the previous answers to get the list of all punctuation characters (excluding the control ones):

    // Requires: using System; using System.Collections.Generic;
    var punctuationCharacters = new List<char>();

    for (int i = char.MinValue; i <= char.MaxValue; i++)
    {
        var character = (char)i;

        // Keep only punctuation; the IsControl check is a safeguard.
        if (char.IsPunctuation(character) && !char.IsControl(character))
        {
            punctuationCharacters.Add(character);
        }
    }

    // Joined with no separator, so the output is one contiguous string.
    var allPunctuationCharacters = string.Join("", punctuationCharacters);

    Console.WriteLine(allPunctuationCharacters);
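
    If you want to keep the full Unicode coverage, one option is to build that list once and reuse it as a lookup set. This is only a sketch under that assumption: the class name `CachedPunctuation` is hypothetical, and whether a `HashSet<char>` lookup actually beats calling `char.IsPunctuation` directly is something to measure on your own data.

    ```csharp
    using System.Collections.Generic;
    using System.Text;

    public static class CachedPunctuation
    {
        // Built once: every UTF-16 code unit that char.IsPunctuation flags.
        private static readonly HashSet<char> Punctuation = Build();

        private static HashSet<char> Build()
        {
            var set = new HashSet<char>();
            for (int i = char.MinValue; i <= char.MaxValue; i++)
            {
                var c = (char)i;
                if (char.IsPunctuation(c))
                {
                    set.Add(c);
                }
            }
            return set;
        }

        public static string Strip(string input)
        {
            var sb = new StringBuilder(input.Length);
            foreach (char c in input)
            {
                // Copy the character only if it is not in the cached set.
                if (!Punctuation.Contains(c))
                {
                    sb.Append(c);
                }
            }
            return sb.ToString();
        }
    }
    ```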
    

    Cheers, Andrew
