Client-side Javascript code to strip bogus HTML from CKEditor

Posted by 筅森魡賤 on 2020-01-04 01:28:05

Question


I believe this may be related to "Need Pure/jQuery Javascript Solution For Cleaning Word HTML From Text Area".

In my case, though, I am using CKEditor. Before sending the data to the server (or after receiving it back), I'd like to strip out "junk" HTML tags and comments such as those that appear when pasting from recent (2007 or later) versions of Microsoft Office. Because the server side here is a third-party application, I'd prefer to do this on the client if I can. Yes, I am aware of the security risks of doing that; this is just meant to sanitize data in common use cases.

Are there any common techniques or existing libraries (especially jQuery-friendly) that can do this? Note, I am not looking to encode or strip all HTML, only the Office-related crud.


Answer 1:


Have you tried CKEditor's built-in Word cleanup functionality? It seems to run automatically when you use the "Paste From Word" dialog, but it can also be called from your own code. I'm not an expert on the CKEditor API, so there might be a more efficient or more correct way of doing this, but the following seems to work on the current release (3.3.1):

function cleanUp() {
    if (!CKEDITOR.cleanWord) {
        // The filter is lazily loaded by the pastefromword plugin, so we load it ourselves
        // and pass this same function as the callback to run once the filter is available.
        // Change the script path to match your installation.
        CKEDITOR.scriptLoader.load("../plugins/pastefromword/filter/default.js", cleanUp, null, false, true);
        alert('loading script for the first usage');
    } else {
        // cleanWord is available for use.
        // Change this to the correct editor instance.
        var editor = CKEDITOR.instances.editor1;
        // Perform the cleanup.
        var cleanedUpData = CKEDITOR.cleanWord(editor.getData(), editor);
        // Do something with the cleaned-up data.
        alert(cleanedUpData);
    }
}

cleanUp();
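
To actually use the result rather than alert it, something like the following might work. This is only a sketch: `applyCleanup` is a name I made up, and it takes the cleaning function as a parameter so it does not depend on how the filter was loaded. It uses only `editor.getData()` and `editor.setData()`.

```javascript
// Hypothetical helper (not part of CKEditor): runs a cleaning function over the
// editor's HTML and writes the result back only when something actually changed.
function applyCleanup(editor, cleanFn) {
    var original = editor.getData();
    var cleaned = cleanFn(original, editor);
    if (cleaned !== original) {
        editor.setData(cleaned);
    }
    return cleaned;
}

// In the browser, once the filter has loaded (see cleanUp above), assuming the
// instance is named editor1:
// applyCleanup(CKEDITOR.instances.editor1, function (html, ed) {
//     return CKEDITOR.cleanWord(html, ed);
// });
```

As far as I know `setData()` can complete asynchronously, so if you call this right before submitting a form it may be safer to put the returned string into the form field yourself.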

If you're not happy with the result, you can modify default.js to suit your cleanup needs. The cleanup also has a few configuration options; check http://docs.cksource.com/ckeditor_api/symbols/CKEDITOR.config.html (search for the "pasteFromWord" options).
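
For illustration, here is roughly how those options could be set when creating the editor. The option names are the "pasteFromWord" ones as I remember them from the 3.x documentation, so please verify them against your version. `buildPasteFromWordConfig` is a hypothetical helper, and the values are examples, not recommendations.

```javascript
// Hypothetical helper: builds a config object with the pasteFromWord options
// and lets the caller override any of them.
function buildPasteFromWordConfig(overrides) {
    var config = {
        // Strip font-related styles (face, size, color) from pasted content.
        pasteFromWordRemoveFontStyles: true,
        // Strip element styles the editor cannot manage itself.
        pasteFromWordRemoveStyles: true,
        // Turn Word's numbered headings into list items.
        pasteFromWordNumberedHeadingToList: false,
        // Ask the user before cleaning up content pasted from Word.
        pasteFromWordPromptCleanup: false
    };
    var extra = overrides || {};
    for (var key in extra) {
        if (Object.prototype.hasOwnProperty.call(extra, key)) {
            config[key] = extra[key];
        }
    }
    return config;
}

// In the browser, assuming a textarea with id "editor1":
// CKEDITOR.replace('editor1', buildPasteFromWordConfig({ pasteFromWordPromptCleanup: true }));
```

Since `CKEDITOR.cleanWord` is passed the editor instance, I would expect it to pick these settings up from `editor.config`, but I have not checked every option.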

If you need something more advanced and can accept a dependency on a server, I suggest you check WordOff (http://wordoff.org/). You might be able to build a proxy and a JSONP wrapper around their service, so you can use it from the client without installing anything on your own server.
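
If neither option fits, a small regex pass can also catch the most common Office debris entirely on the client. The sketch below is my own illustration, not part of CKEditor or WordOff. `stripOfficeMarkup` is a made-up name, the patterns cover only the cases I know of (comments, namespaced tags, `Mso` classes, `mso-` styles), and regexes are not a real HTML parser, so treat it as a starting point.

```javascript
// Hypothetical helper: strips common Word paste debris from an HTML string.
function stripOfficeMarkup(html) {
    return html
        // All HTML comments, including <!--[if gte mso 9]>...<![endif]--> blocks.
        .replace(/<!--[\s\S]*?-->/g, '')
        // Leftover conditional markers such as <![if !supportLists]> and <![endif]>.
        .replace(/<!\[[^\]]*\]>/g, '')
        // <style> blocks and <meta>/<link> tags that Word puts at the top.
        .replace(/<style[\s\S]*?<\/style>/gi, '')
        .replace(/<(meta|link)\b[^>]*>/gi, '')
        // Namespaced tags such as <o:p>, <w:WordDocument>, <v:shape>; inner text is kept.
        .replace(/<\/?[a-z]+:[^>]*>/gi, '')
        // class="MsoNormal" and similar.
        .replace(/\sclass="?Mso[^"\s>]*"?/gi, '')
        // Whole style attributes that contain mso- declarations
        // (this also drops any non-mso declarations in the same attribute).
        .replace(/\sstyle="[^"]*mso-[^"]*"/gi, '');
}
```

Note that this removes every HTML comment, not only the Office ones, which matches what the question asks for but may be too aggressive elsewhere.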



Source: https://stackoverflow.com/questions/3391288/client-side-javascript-code-to-strip-bogus-html-from-ckeditor
