Byte order mark screws up file reading in Java

说谎 2020-11-22 02:55

I'm trying to read CSV files using Java. Some of the files may have a byte order mark at the beginning, but not all. When present, the byte order mark gets read along with the r

9 answers
  •  佛祖请我去吃肉
    2020-11-22 03:28

    A simpler solution:

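    import java.io.IOException;
    import java.io.Reader;

    public class BOMSkipper
    {
        // Consumes a leading BOM (U+FEFF) if present; otherwise leaves the reader untouched.
        // The reader must support mark/reset (BufferedReader does).
        public static void skip(Reader reader) throws IOException
        {
            reader.mark(1);
            char[] possibleBOM = new char[1];
            reader.read(possibleBOM);

            // Rewind to the mark unless the first character really was the BOM.
            if (possibleBOM[0] != '\ufeff')
            {
                reader.reset();
            }
        }
    }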
    

    Usage sample:

    BufferedReader input = new BufferedReader(new InputStreamReader(new FileInputStream(file), fileExpectedCharset));
    BOMSkipper.skip(input);
    // The BOM, if there was one, has now been consumed:
    input.readLine();
    ...
    

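    It works with all five UTF encodings (UTF-8, UTF-16BE, UTF-16LE, UTF-32BE, UTF-32LE): once the bytes are decoded with the correct charset, the BOM always shows up as the single character U+FEFF. The reader must support mark/reset, which BufferedReader does.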
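
    To try the approach above without touching the file system, here is a minimal sketch that runs the same mark/read/reset idea against in-memory bytes with and without a BOM. The class name BomSkipDemo and the helpers skipBom and firstLine are hypothetical names made up for this illustration, and it assumes the bytes are decoded with the charset they were actually written in.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, not part of the original answer.
class BomSkipDemo {

    // Peek at one character and rewind unless it is the BOM (U+FEFF).
    // Assumes the reader supports mark/reset, as BufferedReader does.
    static void skipBom(Reader reader) throws IOException {
        reader.mark(1);
        int first = reader.read();
        if (first != -1 && first != 0xFEFF) {
            reader.reset();
        }
    }

    // Decode the bytes with the given charset, drop a leading BOM if present,
    // and return the first line.
    static String firstLine(byte[] data, Charset charset) {
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(new ByteArrayInputStream(data), charset))) {
            skipBom(in);
            return in.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // UTF-8 BOM bytes (EF BB BF) followed by "a,b" and a newline.
        byte[] utf8WithBom = {(byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a', ',', 'b', '\n'};
        byte[] utf8NoBom = "a,b\n".getBytes(StandardCharsets.UTF_8);
        // The UTF-16LE encoder writes no BOM of its own, so U+FEFF is put in the string.
        byte[] utf16LeWithBom = "\uFEFFa,b\n".getBytes(StandardCharsets.UTF_16LE);

        System.out.println(firstLine(utf8WithBom, StandardCharsets.UTF_8));       // prints a,b
        System.out.println(firstLine(utf8NoBom, StandardCharsets.UTF_8));         // prints a,b
        System.out.println(firstLine(utf16LeWithBom, StandardCharsets.UTF_16LE)); // prints a,b
    }
}
```

    In all three cases the first line comes back without a stray U+FEFF in front of the first field, and the input without a BOM is left unchanged.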
