How do I remove Word markup crap when inserting to a form?

Submitted by 淺唱寂寞╮ on 2019-12-01 20:57:23

Question


I'm building a CMS in PHP, and one thing I dread is that users will have to fill in the data from existing Word documents (and Excel, but never mind that). I've seen what happens when they carelessly copy and paste from Word into a textarea: the database fills up with crap markup.

Now, I could certainly strip all the markup myself, but I'd have to learn about it first. So I ask you: have you tested any functionality or plugins for the usual suspects (TinyMCE, FCKeditor, etc.) that help here? Bonus points for the least intrusive solution.


Answer 1:


Sadly, most of the HTML editor controls I've used either:

  1. Have a button to strip out various kinds of markup (Word, HTML, script, etc.)
  2. Strip out all markup on paste via JavaScript.

If you leave it to a button, non-technical users will generally forget to press it, because they don't (some would say "shouldn't have to") care about it :(

With a bit of playing around with regular expressions (now you have another problem ;)), you could do something similar to option 2, but just for Word XML.
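To make the regular-expression idea a bit more concrete, here is a rough sketch in TypeScript. It is only an illustration, not a tested production filter: the function name `stripWordMarkup` and the particular patterns are my own assumptions about what Word typically leaves behind (conditional comments, `<o:p>`-style namespaced tags, `class`/`style`/`lang` attributes), and regexes on HTML are fragile by nature.

```typescript
// Sketch only: regex-based cleanup of HTML pasted from Word.
// stripWordMarkup and its patterns are hypothetical, chosen for illustration.
function stripWordMarkup(html: string): string {
  return html
    // Conditional comments, e.g. <!--[if gte mso 9]> ... <![endif]-->
    .replace(/<!--\[if[\s\S]*?<!\[endif\]-->/gi, "")
    // Any remaining HTML comments
    .replace(/<!--[\s\S]*?-->/g, "")
    // <style>, <script> and <xml> blocks, including their contents
    .replace(/<(style|script|xml)\b[^>]*>[\s\S]*?<\/\1\s*>/gi, "")
    // Stray <meta> and <link> tags
    .replace(/<\/?(meta|link)\b[^>]*>/gi, "")
    // Namespaced Office tags such as <o:p>, </o:p>, <w:WordDocument>
    .replace(/<\/?[a-z][a-z0-9]*:[^>]*>/gi, "")
    // All class, style and lang attributes (not only the Mso* ones).
    // Caveat: this is a plain text match, so it would also hit the same
    // pattern if it appeared in visible text rather than inside a tag.
    .replace(/\s+(class|style|lang)\s*=\s*("[^"]*"|'[^']*'|[^\s>]+)/gi, "");
}

const sample =
  '<p class="MsoNormal" style="margin:0cm">Hello <o:p></o:p>' +
  '<span style="font-family:Arial">world</span></p>' +
  "<!--[if gte mso 9]><xml><w:WordDocument></w:WordDocument></xml><![endif]-->";

console.log(stripWordMarkup(sample));
// prints: <p>Hello <span>world</span></p>
```

In a browser you could presumably call something like this from a `paste` event handler, reading the HTML with `event.clipboardData.getData("text/html")`, so nobody has to press a button. I'd still clean the input again on the server, since client-side filtering is easy to bypass.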




Answer 2:


I have found that FCKeditor handles text yanked from Word documents and thrown at it much better than TinyMCE does.




Answer 3:


OK, I found a plugin for TinyMCE that apparently does what I wanted. Still, it requires users to press a button to paste, which is less than ideal. Anything better?




Answer 4:


ASP.NET? Telerik's RadEditor has worked very well for me.



Source: https://stackoverflow.com/questions/391291/how-do-i-remove-word-markup-crap-when-inserting-to-a-form
