Remove BOM (ï»¿) from imported .csv file


温柔的废话 2020-11-27 20:40

I want to delete the BOM from my imported file, but it just doesn't seem to work.

I tried preg_replace('/[\x00-\x1F\x80-\xFF]/', '', $file);

6 answers

余生分开走 (original poster)

2020-11-27 20:55

Isn't the BOM there to give you a clue on how to reencode the input to something your script/app/database needs? Just deleting isn't gonna help.
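
To illustrate why deleting alone is not enough, here is a minimal sketch. The sample bytes are made up for the example (the text "a,b" stored as UTF-16LE with its BOM), and it assumes the mbstring extension is loaded.

```php
<?php
// Hypothetical sample: "a,b" encoded as UTF-16LE, preceded by the UTF-16LE BOM (FF FE).
$raw = "\xFF\xFE" . "a\x00,\x00b\x00";

// Only deleting the BOM leaves UTF-16 bytes behind: 6 bytes with embedded NULs.
$stripped = substr($raw, 2);
var_dump(strlen($stripped));   // int(6)

// Treating the BOM as a hint and converting yields the plain UTF-8 text.
$converted = mb_convert_encoding($stripped, "UTF-8", "UTF-16LE");
var_dump($converted);          // string(3) "a,b"
```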

This is how I force a string (drawn from a file with file_get_contents()) to be encoded in UTF-8 and get rid of the BOM as well:

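switch (true) {
    // Test the 4-byte UTF-32 BOMs first: the UTF-32LE BOM (FF FE 00 00)
    // starts with the UTF-16LE BOM (FF FE) and would otherwise never match.
    case (substr($string, 0, 4) === "\x00\x00\xfe\xff"):
        $string = mb_convert_encoding(substr($string, 4), "UTF-8", "UTF-32BE");
        break;
    case (substr($string, 0, 4) === "\xff\xfe\x00\x00"):
        $string = mb_convert_encoding(substr($string, 4), "UTF-8", "UTF-32LE");
        break;
    // UTF-8 BOM: the data is already UTF-8, so just drop the three bytes.
    case (substr($string, 0, 3) === "\xef\xbb\xbf"):
        $string = substr($string, 3);
        break;
    case (substr($string, 0, 2) === "\xfe\xff"):
        $string = mb_convert_encoding(substr($string, 2), "UTF-8", "UTF-16BE");
        break;
    case (substr($string, 0, 2) === "\xff\xfe"):
        $string = mb_convert_encoding(substr($string, 2), "UTF-8", "UTF-16LE");
        break;
    default:
        // No BOM: guess the encoding. mb_detect_encoding() can return false
        // when detection fails, so check the result before relying on it.
        $string = iconv(mb_detect_encoding($string, mb_detect_order(), true), "UTF-8", $string);
}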
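
If the file is only ever expected to be UTF-8 (the ï»¿ in the title is what a UTF-8 BOM looks like when read as Latin-1), a smaller sketch may be enough. The helper name strip_utf8_bom and the sample CSV content are made up for this example; in practice the string would come from file_get_contents().

```php
<?php
// Hypothetical helper: removes a leading UTF-8 BOM if present,
// otherwise returns the input unchanged.
function strip_utf8_bom(string $text): string
{
    return strncmp($text, "\xEF\xBB\xBF", 3) === 0 ? substr($text, 3) : $text;
}

// Sample content standing in for the imported .csv file.
$csv = "\xEF\xBB\xBFid,name\n1,Alice\n";

// Strip the BOM, split into lines, and parse each line as CSV.
$lines = explode("\n", trim(strip_utf8_bom($csv)));
$rows  = array_map(fn($line) => str_getcsv($line, ",", '"', "\\"), $lines);

// The first header cell is now "id" instead of "\xEF\xBB\xBFid".
var_dump($rows[0][0]);
```

Without the strip, the BOM bytes stay glued to the first header name, which is typically why lookups on that column fail after an import.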
