How to read a text file with mixed encodings in Scala or Java?

日久生厌 2020-12-07 16:00

I am trying to parse a CSV file, ideally using weka.core.converters.CSVLoader. However, the file I have is not a valid UTF-8 file. It is mostly a UTF-8 file but some of the f

7 Answers
  •  萌比男神i
    2020-12-07 16:07

    Scala's Codec has a decoder method that returns a java.nio.charset.CharsetDecoder:

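    import java.nio.charset.CodingErrorAction
    import scala.io.{Codec, Source}
    val decoder = Codec.UTF8.decoder.onMalformedInput(CodingErrorAction.IGNORE) // silently drop bytes that are not valid UTF-8
    Source.fromFile(filename)(decoder).getLines().toList // the decoder is converted to a Codec implicitly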
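
    To see what the error action actually does to the data, here is a minimal sketch that decodes a byte array directly with a CharsetDecoder. The object name MixedEncodingSketch, the helpers decode and decodeWithFallback, and the sample bytes are all hypothetical. The ISO-8859-1 fallback is only an assumption, since the question does not say which encoding the stray bytes are in.

```scala
import java.nio.ByteBuffer
import java.nio.charset.{CharacterCodingException, Charset, CodingErrorAction, StandardCharsets}

// Hypothetical helper object, for illustration only.
object MixedEncodingSketch {
  // Decode bytes as UTF-8, applying `action` to malformed or unmappable input.
  def decode(bytes: Array[Byte], action: CodingErrorAction): String =
    StandardCharsets.UTF_8.newDecoder()
      .onMalformedInput(action)
      .onUnmappableCharacter(action)
      .decode(ByteBuffer.wrap(bytes))
      .toString

  // Try strict UTF-8 first; if that fails, reinterpret the whole chunk in `fallback`.
  // Assumption: the caller knows (or guesses) the second encoding.
  def decodeWithFallback(bytes: Array[Byte], fallback: Charset): String =
    try decode(bytes, CodingErrorAction.REPORT)
    catch { case _: CharacterCodingException => new String(bytes, fallback) }

  def main(args: Array[String]): Unit = {
    val utf8 = StandardCharsets.UTF_8
    // "caf", then 0xE9 (é in ISO-8859-1, not valid UTF-8 on its own), then " au lait".
    val sample = "caf".getBytes(utf8) ++ Array(0xE9.toByte) ++ " au lait".getBytes(utf8)

    println(decode(sample, CodingErrorAction.IGNORE))               // caf au lait
    println(decode(sample, CodingErrorAction.REPLACE))              // caf\uFFFD au lait (replacement character)
    println(decodeWithFallback(sample, StandardCharsets.ISO_8859_1)) // café au lait
  }
}
```

    IGNORE loses the offending character without a trace, while REPLACE at least leaves a U+FFFD marker where something was dropped. If the file mixes encodings line by line, applying something like decodeWithFallback to each line's raw bytes may preserve more of the data, provided the fallback charset guess is right.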
    
