How to convert HTML to RTF using JavaScript

泪湿孤枕 提交于 2019-12-07 12:59:44

问题


I have an input HTML file with header and footer. It needs to be converted to RTF. The header/footer of HTML should be repeated in the resultant RTF file.

Is there any plugin to convert HTML to RTF by only using JavaScript??


回答1:


You can use this converter

However it does not address bullet points (ul, li elements)

function convertHtmlToRtf(html) {
  if (!(typeof html === "string" && html)) {
      return null;
  }

  var tmpRichText, hasHyperlinks;
  var richText = html;

  // Singleton tags
  richText = richText.replace(/<(?:hr)(?:\s+[^>]*)?\s*[\/]?>/ig, "{\\pard \\brdrb \\brdrs \\brdrw10 \\brsp20 \\par}\n{\\pard\\par}\n");
  richText = richText.replace(/<(?:br)(?:\s+[^>]*)?\s*[\/]?>/ig, "{\\pard\\par}\n");

  // Empty tags
  richText = richText.replace(/<(?:p|div|section|article)(?:\s+[^>]*)?\s*[\/]>/ig, "{\\pard\\par}\n");
  richText = richText.replace(/<(?:[^>]+)\/>/g, "");

  // Hyperlinks
  richText = richText.replace(
      /<a(?:\s+[^>]*)?(?:\s+href=(["'])(?:javascript:void\(0?\);?|#|return false;?|void\(0?\);?|)\1)(?:\s+[^>]*)?>/ig,
      "{{{\n");
  tmpRichText = richText;
  richText = richText.replace(
      /<a(?:\s+[^>]*)?(?:\s+href=(["'])(.+)\1)(?:\s+[^>]*)?>/ig,
      "{\\field{\\*\\fldinst{HYPERLINK\n \"$2\"\n}}{\\fldrslt{\\ul\\cf1\n");
  hasHyperlinks = richText !== tmpRichText;
  richText = richText.replace(/<a(?:\s+[^>]*)?>/ig, "{{{\n");
  richText = richText.replace(/<\/a(?:\s+[^>]*)?>/ig, "\n}}}");

  // Start tags
  richText = richText.replace(/<(?:b|strong)(?:\s+[^>]*)?>/ig, "{\\b\n");
  richText = richText.replace(/<(?:i|em)(?:\s+[^>]*)?>/ig, "{\\i\n");
  richText = richText.replace(/<(?:u|ins)(?:\s+[^>]*)?>/ig, "{\\ul\n");
  richText = richText.replace(/<(?:strike|del)(?:\s+[^>]*)?>/ig, "{\\strike\n");
  richText = richText.replace(/<sup(?:\s+[^>]*)?>/ig, "{\\super\n");
  richText = richText.replace(/<sub(?:\s+[^>]*)?>/ig, "{\\sub\n");
  richText = richText.replace(/<(?:p|div|section|article)(?:\s+[^>]*)?>/ig, "{\\pard\n");

  // End tags
  richText = richText.replace(/<\/(?:p|div|section|article)(?:\s+[^>]*)?>/ig, "\n\\par}\n");
  richText = richText.replace(/<\/(?:b|strong|i|em|u|ins|strike|del|sup|sub)(?:\s+[^>]*)?>/ig, "\n}");

  // Strip any other remaining HTML tags [but leave their contents]
  richText = richText.replace(/<(?:[^>]+)>/g, "");

  // Prefix and suffix the rich text with the necessary syntax
  richText =
      "{\\rtf1\\ansi\n" + (hasHyperlinks ? "{\\colortbl\n;\n\\red0\\green0\\blue255;\n}\n" : "") + richText +  "\n}";

  return richText;
}



回答2:


No such thing I'm afraid. I checked into this when looking for any HTML to RTF converter. Unfortunately they are a rare item.

Your only option would be the make one based on the RTF specs. https://msdn.microsoft.com/en-us/library/aa140277(v=office.10).aspx




回答3:


After a bit of search i found a working solution that works fine.

https://www.npmjs.com/package/html-to-rtf

with html-to-rtf the conversion is easy (i put a piece of code based on browserify):

var html

ToRtf  = require("html-to-rtf");
var htmlText = "<div>...</div>"; //or whatever html you want to transform
var htmlAsRtf = htmlToRtf.convertHtmlToRtf(htmlText); // html transformed to rtf

This solution worked for me. Without browserify you'll have to find implied js inside downloaded modules with npm and link them to your html page.

Hopes this helps




回答4:


I applied @Samra solution and it was working good. But then I spotted a bug in the output: some text was cut off. After a lot of investigation, it seemed to be about HTML comments (<!-- xxxx -->) weren't being handled properly. My solution was to add this richText transformation as the first one:

// Delete HTML comments
richText = richText.replace(/<!--[\s\S]*?-->/ig,"");


来源:https://stackoverflow.com/questions/29299632/how-to-convert-html-to-rtf-using-javascript

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!