Consecutive control characters in Quoted Printable not decoding correctly

萌比男神i 2021-01-27 09:54

I have a mail processing engine that reads in emails (usually UTF-8 encoded) and processes them. I found a neat solution here for how to interpret the control characters. Bu

1 answer
  • 2021-01-27 10:34

    Here is a piece of code I found on SO while looking for a quoted-printable decoder:

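    // Requires: using System; using System.Collections.Generic; using System.Text;
    private static string Decode(string input, string bodycharset)
    {
        var i = 0;
        var output = new List<byte>();
        while (i < input.Length)
        {
            // Bounds check so a '=' near the end of the input cannot index past it.
            var hasTwoMore = i + 2 < input.Length;
            if (input[i] == '=' && hasTwoMore && input[i + 1] == '\r' && input[i + 2] == '\n')
            {
                // Soft line break ("=\r\n"): not part of the data, skip it.
                i += 3;
            }
            else if (input[i] == '=' && hasTwoMore && Uri.IsHexDigit(input[i + 1]) && Uri.IsHexDigit(input[i + 2]))
            {
                // "=XX" escape: the two hex digits encode a single byte.
                output.Add(Convert.ToByte(input.Substring(i + 1, 2), 16));
                i += 3;
            }
            else
            {
                // Literal character (ASCII in valid quoted-printable input);
                // a malformed '=' is copied through instead of throwing.
                output.Add((byte)input[i]);
                i++;
            }
        }
        // Decode all bytes in one call so that consecutive escapes
        // (e.g. =E2=80=99) are combined into one multi-byte character.
        if (String.IsNullOrEmpty(bodycharset))
            return Encoding.UTF8.GetString(output.ToArray());
        else
            return Encoding.GetEncoding(bodycharset).GetString(output.ToArray());
    }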
    

    Source: Decoding Quoted printable message

    Decode("Elke=E2=80=99s motto", "utf-8") -> Elke’s motto
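Why this fixes consecutive escapes: in UTF-8 a character such as ’ is three bytes (E2 80 99), so the three escapes only make sense when decoded as one byte run. The sketch below contrasts the two approaches. The class and method names (QuotedPrintableDemo, EscapesToBytes, DecodeOneByOne, DecodeTogether) are hypothetical and only for illustration, and the helper assumes its input consists solely of well-formed "=XX" escapes.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QuotedPrintableDemo
{
    // Hypothetical helper: converts a run of "=XX" escapes into raw bytes.
    // Assumes the input contains nothing but well-formed escapes.
    public static byte[] EscapesToBytes(string escapes)
    {
        var bytes = new List<byte>();
        for (var i = 0; i + 2 < escapes.Length; i += 3)
        {
            // escapes[i] is '='; the next two characters are hex digits.
            bytes.Add(Convert.ToByte(escapes.Substring(i + 1, 2), 16));
        }
        return bytes.ToArray();
    }

    // Decodes every byte on its own. A multi-byte UTF-8 sequence is
    // split apart, so no single piece is a valid character.
    public static string DecodeOneByOne(string escapes)
    {
        var sb = new StringBuilder();
        foreach (var b in EscapesToBytes(escapes))
        {
            sb.Append(Encoding.UTF8.GetString(new[] { b }));
        }
        return sb.ToString();
    }

    // Decodes the whole byte run in one call, which lets the decoder
    // combine E2 80 99 into a single character.
    public static string DecodeTogether(string escapes)
    {
        return Encoding.UTF8.GetString(EscapesToBytes(escapes));
    }
}
```

With the input "=E2=80=99", DecodeTogether should give the single character ’ (U+2019), while DecodeOneByOne cannot, because each byte on its own is an incomplete UTF-8 sequence. This is the reason the answer's Decode method collects everything into a List<byte> first and calls GetString only once at the end.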
