The situation is a string that results in something like this:
This is some text and here is a bold text then the post stop here....
What about using PHP's native DOMDocument class? It parses HTML and corrects syntax errors, such as unclosed tags, along the way. For example:
$fragment = "Title
Unclosed";
$doc = new DOMDocument();
$doc->loadHTML($fragment);
$correctFragment = $doc->getElementsByTagName('body')->item(0)->C14N();
echo $correctFragment;
However, this approach has several disadvantages.
Firstly, it wraps the original fragment in a <body> tag. You can get rid of it easily with something like str_replace() or preg_replace(), or by replacing the ...->C14N() call with a custom innerHTML() function, as suggested for example at http://php.net/manual/en/book.dom.php#89718.
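A minimal sketch of such an innerHTML() helper might look like the following. The function name, its signature, and the sample fragment are assumptions made for illustration and are not taken from the linked comment. It serializes only the children of the <body> node, so the wrapper never appears in the output.

```php
<?php
// Hypothetical helper: concatenate the HTML of a node's children,
// leaving out the node's own opening and closing tags.
function innerHTML(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        // DOMDocument::saveHTML() accepts a single node to serialize.
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}

// Assumed sample input with an unclosed <p>.
$fragment = '<div><b>Title</b><p>Unclosed';

$doc = new DOMDocument();
$doc->loadHTML($fragment);

$body = $doc->getElementsByTagName('body')->item(0);
$clean = innerHTML($body);
echo $clean, PHP_EOL;
```

Compared with stripping the wrapper by string replacement, this avoids depending on exactly how the <body> tag is serialized. Note that saveHTML() produces HTML serialization rather than the canonical XML that C14N() returns, so the two outputs can differ in details such as void elements.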
The second pitfall is that PHP emits an 'invalid tag in Entity' warning if HTML5 or custom tags are used (nevertheless, parsing still completes correctly).
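One way to keep those warnings out of the output is to let libxml collect them internally, as in this sketch. The sample fragment is an assumption, and whether a given tag triggers a warning at all depends on the installed libxml2 version.

```php
<?php
// Buffer libxml parse errors instead of emitting PHP warnings.
// The previous setting is returned so it can be restored afterwards.
$previous = libxml_use_internal_errors(true);

// Assumed sample input containing an HTML5 tag and an unclosed <p>.
$fragment = '<article><h3>Title</h3><p>Unclosed';

$doc = new DOMDocument();
$doc->loadHTML($fragment);

// Collect whatever the parser complained about, then empty the buffer.
$messages = [];
foreach (libxml_get_errors() as $error) {
    $messages[] = trim($error->message);
}
libxml_clear_errors();
libxml_use_internal_errors($previous);

$body = $doc->getElementsByTagName('body')->item(0);
$output = $doc->saveHTML($body);
echo $output, PHP_EOL;
```

The collected $messages can be logged or ignored as needed. Restoring the previous setting keeps the change from affecting other XML or HTML parsing in the same request.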