Question
I am using a JavaScript file that is a concatenation of other JavaScript files.
Unfortunately, the person who concatenated these JavaScript files did not use the proper encoding when reading them, so the BOM of every single source file was written into the concatenated JavaScript file.
Does anyone know a simple way to search through the concatenated file and remove any/all BOM markers?
A solution using PHP or a bash script for Mac OS X would be great.
Answer 1:
See also: Using awk to remove the Byte-order mark
To remove multiple BOMs from anywhere within a text file you can try something similar. Just leave out the ^ anchor and add the g flag so that every occurrence on a line is removed, not only the first:
perl -e 's/\xef\xbb\xbf//g;' -pi~ file.js
(This edits the file in place, but creates a backup named file.js~.)
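To check that the substitution really caught every BOM, it can help to count the EF BB BF byte sequences before and after. The following is a minimal sketch, assuming perl is on the PATH; the function names count_boms and strip_boms are made up for this example and are not part of the original answer.

```shell
# count_boms FILE: print how many UTF-8 BOM sequences (EF BB BF) FILE contains.
count_boms() {
  # "() = /.../g" forces list context, so the assignment yields the match count.
  perl -ne '$n += () = /\xef\xbb\xbf/g; END { print $n + 0, "\n" }' "$1"
}

# strip_boms FILE: remove every BOM in place, keeping a backup as FILE~.
strip_boms() {
  perl -pi~ -e 's/\xef\xbb\xbf//g' "$1"
}
```

Running count_boms file.js, then strip_boms file.js, then count_boms file.js again should report 0 on the second count, with the untouched original kept as file.js~.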
Answer 2:
I normally do it using vim:
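vim -c "set nobomb" -c wq! myfile
(Note that the bomb option only concerns the BOM at the very start of the file, so this will not remove BOMs embedded further down.)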
Answer 3:
Find the files that contain a BOM:
grep -rIlo $'^\xEF\xBB\xBF' ./
Remove the BOMs from those files (the --in-place long option requires GNU sed):
grep -rIlo $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//g'
Exclude the .svn directory:
grep -rIlo --exclude-dir=".svn" $'^\xEF\xBB\xBF' . | xargs sed --in-place -e 's/\xef\xbb\xbf//g'
- See more at: http://www.a5go.com/how-to-remove-bom-from-utf-8-using-sed.html#
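The sed that ships with Mac OS X is the BSD variant, which does not accept the GNU --in-place long option, so the commands above may need adapting there. As a rough sketch under that assumption, the same recursive cleanup can be done with find and perl. The function name strip_boms_recursive and the restriction to *.js files are choices made for this example only.

```shell
# strip_boms_recursive DIR: remove every UTF-8 BOM from *.js files under DIR,
# without descending into .svn directories. No backup files are kept.
strip_boms_recursive() {
  find "$1" -type d -name .svn -prune -o -type f -name '*.js' -exec \
    perl -pi -e 's/\xef\xbb\xbf//g' {} +
}
```

Calling strip_boms_recursive . from the project root would process the whole tree. Since no backups are written here, it is safer to try it on a copy first.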
Answer 4:
I also figured out this solution which works entirely in PHP:
$packed = pack("CCC",0xef,0xbb,0xbf);
$contents = preg_replace('/'.$packed.'/','',$contents);
Source: https://stackoverflow.com/questions/9100728/remove-multiple-boms-from-a-file