How do I convert an RTF string to a Markdown string (and back) (C# .NET Core, or JS)

Submitted by 落爺英雄遲暮 on 2020-01-25 07:56:26

Question:

How do I convert an RTF string to a Markdown string (and back) either in C# or JS, ideally without wrapping an exe?


I have a legacy product that uses .NET's RichTextBox control. Forms that use it save their output in Microsoft's proprietary RTF format. Here is a small example of the output it can generate:

{\\rtf1\\ansi\\ansicpg1252\\uc1\\htmautsp\\deff2{\\fonttbl{\\f0\\fcharset0 Times New Roman;}{\\f2\\fcharset0 GenericSansSerif;}}{\\colortbl\\red0\\green0\\blue0;\\red255\\green255\\blue255;}\\loch\\hich\\dbch\\pard\\plain\\ltrpar\\itap0{\\lang1033\\fs18\\f2\\cf0 \\cf0\\ql{\\f2 {\\ltrch Some content here }\\li0\\ri0\\sa0\\sb0\\fi0\\ql\\par}\r\n}\r\n}

My C# .NET Core Web App needs to be able to use this stored RTF to display a "Rich Text Editor" on a web page, have the ability to update the value, and save in a format that can still be used by the legacy product.

Unfortunately, I am having trouble finding existing/modern web components that can use RTF as input. Most appear to use markdown or a custom JSON format.

Ideally, I would like to:

  1. Convert the existing RTF to Markdown using either:
    • Server side, using C#
    • Client side, using JS
  2. Use the markdown with one of the existing Rich Text Editing web components I've found.
  3. On save, convert the web component's markdown to RTF before persisting

So far, I have tried:

  • Following this CodeProject write-up for creating a custom RTF -> HTML converter: Writing Your Own RTF Converter
    • I can get it to work in a .NET Framework project, but not .NET Core
  • Using this NuGet Package: RtfPipe
    • Throws null reference errors in .NET Core projects
  • Using this Node Module: rtf-to-html
    • Only supports a small subset of RTF, creates an entire HTML document instead of a string/subset, and breaks on my specific example

Note: Everything I've tried converts RTF -> HTML, because I couldn't find anything for RTF -> Markdown specifically. My hope was that, as a last resort, I could chain RTF -> HTML -> Markdown (and the reverse).


Answer 1:


Sorry for the null reference errors you had with RtfPipe and .NET Core. The fix is now documented on the project: add the NuGet package System.Text.Encoding.CodePages and register the code page provider before converting.

#if NETCORE
  // .NET Core only: requires a reference to the NuGet package System.Text.Encoding.CodePages
  Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
#endif
var html = Rtf.ToHtml(rtf);

Since Markdown permits embedded HTML, the HTML output is technically already Markdown, so you could stop here. Otherwise, you can convert the HTML to Markdown as well using my BracketPipe library. The code would look something like this:

using BracketPipe;
using RtfPipe;

private string RtfToMarkdown(string source)
{
  using (var w = new System.IO.StringWriter())
  using (var md = new MarkdownWriter(w))
  {
    Rtf.ToHtml(source, md);
    md.Flush();
    return w.ToString();
  }
}

Markdig is a good library for getting from Markdown to HTML. However, I don't have any good suggestions for getting from HTML to RTF.
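For the Markdown -> HTML direction, a minimal sketch with Markdig might look like the following. It assumes the Markdig NuGet package is referenced; the class and method names (MarkdownToHtmlSketch, Convert) are made up for illustration, and the default pipeline is used without any extensions.

```csharp
using Markdig;

public static class MarkdownToHtmlSketch
{
    // Converts a Markdown string to an HTML fragment using Markdig's
    // default pipeline (no optional extensions enabled).
    public static string Convert(string markdown)
    {
        var pipeline = new MarkdownPipelineBuilder().Build();
        return Markdown.ToHtml(markdown, pipeline);
    }
}
```

Calling MarkdownToHtmlSketch.Convert("**bold**") should return an HTML fragment rather than a full document. This only covers the step from the editor's Markdown to HTML; the HTML -> RTF step still needs a separate solution.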

Disclaimer: I am the author of the RtfPipe and BracketPipe open source projects.



Source: https://stackoverflow.com/questions/46119392/how-do-i-convert-an-rtf-string-to-a-markdown-string-and-back-c-net-core-or
