How to output unicode string to RTF (using C#)

小蘑菇 2020-11-29 07:15

I'm trying to output a Unicode string in RTF format (using C# and WinForms).

From Wikipedia:

If a Unicode escape is required, the control wor

4 answers
  •  萌比男神i
    2020-11-29 08:11

    Provided that all the characters that you're catering for exist in the Basic Multilingual Plane (it's unlikely that you'll need anything more), then a simple UTF-16 encoding should suffice.

    Wikipedia:

    All possible code points from U+0000 through U+10FFFF, except for the surrogate code points U+D800–U+DFFF (which are not characters), are uniquely mapped by UTF-16 regardless of the code point's current or future character assignment or use.
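
    To make the BMP caveat concrete, here is a small sketch (written with top-level statements, so it assumes C# 9 or later). A C# char is a single UTF-16 code unit, so a character outside the BMP occupies two char values, a surrogate pair, and a per-char loop sees the two halves separately. The sample characters are arbitrary choices for illustration.

```csharp
using System;

string bmp = "\u00EB";          // U+00EB (ë), inside the BMP
string astral = "\U0001F600";   // U+1F600, outside the BMP

// One char per BMP character, two chars (a surrogate pair) otherwise.
Console.WriteLine(bmp.Length);                       // 1
Console.WriteLine(astral.Length);                    // 2
Console.WriteLine(char.IsHighSurrogate(astral[0]));  // True
Console.WriteLine(char.IsLowSurrogate(astral[1]));   // True

// ConvertToUtf32 recombines the pair into the real code point.
Console.WriteLine(char.ConvertToUtf32(astral, 0));   // 128512
```

    If you do need such characters, my understanding is that RTF writers emit one \u escape per surrogate half, but check that against the RTF specification and the reader you are targeting.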

    The following sample program illustrates doing something along the lines of what you want:

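    using System;
    using System.IO;
    using System.Text;

    static class Program
    {
        static void Main(string[] args)
        {
            // U+00EB (ë), decoded from its UTF-16LE bytes
            char[] ca = Encoding.Unicode.GetChars(new byte[] { 0xeb, 0x00 });
            // using closes the file even if a write throws
            using (var sw = new StreamWriter(@"c:/helloworld.rtf"))
            {
                sw.WriteLine(@"{\rtf1
    {\fonttbl {\f0 Times New Roman;}}
    \f0\fs60 H" + GetRtfUnicodeEscapedString(new String(ca)) + @"llo, World!
    }");
            }
        }

        static string GetRtfUnicodeEscapedString(string s)
        {
            var sb = new StringBuilder();
            foreach (var c in s)
            {
                if (c <= 0x7f)
                    sb.Append(c);
                else
                    // \uN takes a signed 16-bit decimal, so values above 0x7FFF wrap negative
                    sb.Append("\\u" + unchecked((short)c) + "?");
            }
            return sb.ToString();
        }
    }

    The important bit is the numeric conversion of c, which yields the character's UTF-16 code unit; for characters in the Basic Multilingual Plane this is the same as the code point. The RTF \u escape takes that value as a signed 16-bit decimal number, so code units above 0x7FFF (32767) have to be written as negative numbers, which is why the cast is to short. The ? after the number is the fallback character that readers without Unicode support display instead. The System.Text.Encoding.Unicode encoding corresponds to UTF-16 (little-endian) as per the MSDN documentation.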
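
    One gap worth noting: the sample passes every ASCII character through unchanged, including \, { and }, which are RTF syntax characters. Below is a minimal sketch of a more defensive escaper, assuming you want those three characters escaped as well. EscapeRtf is a hypothetical name of mine, not part of any library, and the sketch uses top-level statements (C# 9 or later). It does not translate line breaks into \par.

```csharp
using System;
using System.Text;

// Hypothetical helper: escapes a string for use as plain RTF text.
static string EscapeRtf(string s)
{
    var sb = new StringBuilder();
    foreach (char c in s)
    {
        if (c == '\\' || c == '{' || c == '}')
        {
            // RTF syntax characters are escaped with a leading backslash.
            sb.Append('\\').Append(c);
        }
        else if (c <= 0x7f)
        {
            // Remaining ASCII is copied through unchanged.
            sb.Append(c);
        }
        else
        {
            // \uN takes a signed 16-bit decimal, so code units above 0x7FFF
            // wrap to negative values. The trailing '?' is the fallback
            // character for readers that do not understand \u.
            sb.Append("\\u").Append(unchecked((short)c)).Append('?');
        }
    }
    return sb.ToString();
}

Console.WriteLine(EscapeRtf("H\u00EBllo"));  // H\u235?llo
Console.WriteLine(EscapeRtf("{a\\b}"));      // \{a\\b\}
Console.WriteLine(EscapeRtf("\uFF21"));      // \u-223?
```

    The last line shows the signed wrap-around: U+FF21 is 65313, which does not fit in a signed 16-bit value and comes out as 65313 - 65536 = -223.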
