I have a site where users can post stuff (as in forums, comments, etc) using a customised implementation of TinyMCE. A lot of them like to copy & paste from Word, which
In my case, this worked just fine:
$text = strip_tags($text, '');
Rather than trying to pull out the stuff you don't want, such as embedded Word XML, you can just specify your allowed tags.
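For illustration, here is a minimal sketch of that whitelist approach. The tag list, the `clean_pasted_html` helper name and the sample input are all made up for the example, not taken from my actual setup, so adjust them to whatever your editor should allow.

```php
<?php
// Hypothetical whitelist: every tag NOT listed here gets removed.
$allowedTags = '<p><br><strong><em><ul><ol><li><a>';

// Hypothetical helper wrapping strip_tags() with a whitelist.
function clean_pasted_html(string $html, string $allowedTags): string
{
    // strip_tags() drops any tag missing from the whitelist but keeps
    // the text between those tags; HTML comments are removed as well.
    return trim(strip_tags($html, $allowedTags));
}

// Made-up sample of the kind of markup a paste from Word can produce.
$pasted = '<p class="MsoNormal">Hello <o:p></o:p><b>world</b></p>'
        . '<!--[if gte mso 9]><xml><w:WordDocument></w:WordDocument></xml><![endif]-->';

echo clean_pasted_html($pasted, $allowedTags);
```

One thing to keep in mind: strip_tags() leaves the attributes on allowed tags alone, so things like class="MsoNormal" or inline style attributes will survive unless you clean them up separately.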