$string = file_get_contents('http://example.com');
if ('UTF-8' === mb_detect_encoding($string)) {
    $dom = new DOMDocument();
    // hack to preserve UTF-8 ch
If it is definitely the DOM that is mangling the encoding, this trick worked for me a while back, the other way round (accepting ISO-8859-1 data). DOMDocument should default to UTF-8 in any case, but you can still try:
$dom = new DOMDocument('1.0', 'utf-8');
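For what it's worth, here is a minimal sketch of how the pieces could fit together. Setting the encoding in the constructor does not by itself change how loadHTML() reads its input: when the markup declares no charset, the HTML parser falls back to ISO-8859-1. Prefixing an XML encoding declaration is one commonly used workaround for that, and not necessarily the hack from the snippet above. The sample string and the variable names are made up for illustration.

```php
<?php
// Collect parser warnings instead of printing them.
libxml_use_internal_errors(true);

// Hypothetical sample input: a UTF-8 HTML fragment with no charset declaration.
$html = '<p>Grüße, café</p>';

// mb_check_encoding() validates the bytes; mb_detect_encoding() only guesses.
if (!mb_check_encoding($html, 'UTF-8')) {
    throw new RuntimeException('Input is not valid UTF-8');
}

$dom = new DOMDocument('1.0', 'utf-8');

// Assumed workaround: the XML declaration tells the HTML parser to read the
// input as UTF-8 rather than falling back to ISO-8859-1.
$dom->loadHTML('<?xml encoding="utf-8" ?>' . $html);

$text = $dom->getElementsByTagName('p')->item(0)->textContent;
echo $text, PHP_EOL;
```

If the non-ASCII characters come back garbled when you drop the prefix, then the parser's charset fallback is the culprit rather than the data itself.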