Removing hidden characters from within strings

既然无缘 2020-12-01 10:37

My problem:

I have a .NET application that sends out newsletters via email. When the newsletters are viewed in Outlook, it displays a question mark in place

8 answers
  •  遥遥无期
    2020-12-01 10:55

    I usually use this regular expression to replace all non-printable characters.

    By the way, most people consider tab, line feed, and carriage return to be non-printable characters, but I don't, so the expression leaves them alone.

    So here is the expression:

    string output = Regex.Replace(input, @"[^\u0009\u000A\u000D\u0020-\u007E]", "*");
    
    • ^ at the start of the brackets negates the set: the expression matches any character that is NOT one of the following
    • \u0009 is tab
    • \u000A is line feed
    • \u000D is carriage return
    • \u0020-\u007E means everything from space to ~, that is, all printable ASCII characters.

    Consult an ASCII table if you want to adjust the ranges. Keep in mind that this expression replaces every non-ASCII character as well, including legitimate accented or non-Latin letters.
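
    If the goal is to remove the characters rather than mark them with *, a minimal sketch of that variation might look like the following. The names TextCleaner and StripNonPrintableAscii are hypothetical, made up for illustration; only the pattern comes from the expression above.

    ```csharp
    using System.Text.RegularExpressions;

    public static class TextCleaner
    {
        // Same character class as above: anything that is NOT tab, line feed,
        // carriage return, or printable ASCII (space through ~).
        private static readonly Regex NonPrintable =
            new Regex(@"[^\u0009\u000A\u000D\u0020-\u007E]");

        // Hypothetical helper: deletes the matches instead of replacing them with "*".
        public static string StripNonPrintableAscii(string input)
        {
            return NonPrintable.Replace(input, string.Empty);
        }
    }
    ```

    For example, TextCleaner.StripNonPrintableAscii("Hello\u200B World") returns "Hello World": the zero-width space (U+200B) is dropped, while ordinary spaces, tabs, and line breaks are kept.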

    To test above you can create a string by yourself like this:

        // Build a test string containing the characters U+0000 through U+00FE.
        string input = string.Empty;
    
        for (int i = 0; i < 255; i++)
        {
            input += (char)(i);
        }
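
    To check what the expression does with that test string, one option is to count the matches. This is only a sketch, assuming the same 255-character string (U+0000 through U+00FE) as the loop above; RegexCheck and CountMatches are hypothetical names.

    ```csharp
    using System.Linq;
    using System.Text.RegularExpressions;

    public static class RegexCheck
    {
        // Builds the same 255-character test string and returns how many
        // characters the expression would replace.
        public static int CountMatches()
        {
            string input = new string(
                Enumerable.Range(0, 255).Select(i => (char)i).ToArray());

            return Regex.Matches(input, @"[^\u0009\u000A\u000D\u0020-\u007E]").Count;
        }
    }
    ```

    RegexCheck.CountMatches() returns 157: the 255 characters minus tab, line feed, carriage return, and the 95 printable ASCII characters.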
    
