Remove BOM () from imported .csv file

白昼怎懂夜的黑 提交于 2019-11-28 10:22:54

Try this:

function removeBomUtf8($s){
  if(substr($s,0,3)==chr(hexdec('EF')).chr(hexdec('BB')).chr(hexdec('BF'))){
       return substr($s,3);
   }else{
       return $s;
   }
}

Read data with file_get_contents then use mb_convert_encoding to convert to UTF-8

UPDATE

$filepath = get_bloginfo('template_directory')."/testing.csv";
$fileContent = file_get_contents($filepath);
$fileContent = mb_convert_encoding($fileContent, "UTF-8");
$lines = explode("\n", $fileContent);
foreach($lines as $line) {
    $conls = explode(";", $line);
    // etc...
}

If the character encoding functions don't work for you (as is the case for me in some situations) and you know for a fact that your file always has a BOM, you can simply use an fseek() to skip the first 3 bytes, which is the length of the BOM.

$fp = fopen("testing.csv", "r");
fseek($fp, 3);

You should also not use explode() to split your CSV lines and columns because if your column contains the character by which you split, you will get an incorrect result. Use this instead:

while (!feof($fp)) {
    $arrayLine = fgetcsv($fp, 0, ";", '"');
    ...
}

Using @Tomas'z answer as the main inspiration for this, and @Nolwennig's comment:

// Strip byte order marks from a string
function strip_bom($string, $type = 'utf8') {
    $length = 0;

    switch($type) {
        case 'utf8':
            $length = substr($string, 0, 3) === chr(0xEF) . chr(0xBB) . chr(0xBF) ? 3 : 0;
        break;

        case 'utf16_little_endian':
            $length = substr($string, 0, 2) === chr(0xFF) . chr(0xFE) ? 2 : 0;
        break;
    }

    return $length ? substr($string, $length) : $string;
}
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!