Scala java.nio.charset.UnmappableCharacterException: Input length = 1

Submitted by 醉酒当歌 on 2019-12-23 02:27:07

Question


I've found several questions with similar titles, but couldn't use any of them to resolve my issue. I can't seem to load my .csv file:

val source = io.Source.fromFile("C:/mon_usatotaldat.csv")

Returns:

java.nio.charset.UnmappableCharacterException: Input length = 1

So I tried:

val source = io.Source.fromFile("UTF-8", "C:/mon_usatotaldat.csv")

and got:

java.nio.charset.IllegalCharsetNameException: C:/mon_usatotaldat.csv

I guess UTF-8 wouldn't work, if the file isn't in UTF-8 format, so that makes sense, but I don't know what to do next.

I've managed to discover the encoding is windows-1252 using:

val source = io.Source.fromFile("C:/mon_usatotaldat.csv").codec.decodingReplaceWith("UTF-8")

But this didn't do what I had expected, which was to convert the file to UTF-8. I have no idea how to work with it.

Another thing I've tried was:

val source = io.Source.fromFile("windows-1252","C:/mon_usatotaldat.csv")

But that returned:

java.nio.charset.IllegalCharsetNameException: C:/mon_usatotaldat.csv

Please help. Thanks in advance.


Answer 1:


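Try converting your Excel file to UTF-8 first, and then try val source = io.Source.fromFile("C:/mon_usatotaldat.csv", "UTF-8"). Note that the file name comes first and the encoding second; passing them in the opposite order is what triggers the IllegalCharsetNameException.

To convert to UTF-8: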

(1) Open the Excel file that contains the data (.xls, .xlsx).

(2) In Excel, choose "CSV (Comma Delimited) (*.csv)" as the file type and save the file as that type.

(3) Open the saved .csv file in Notepad (found under "Programs" and then "Accessories" in the Start menu).

(4) Choose Save As. At the bottom of the "Save As" dialog there is a select box labelled "Encoding". Select UTF-8 (do NOT use ANSI, or accented characters may be lost), then save the file under a slightly different name from the original.

This file is in UTF-8 and retains all characters and accents and can be imported, for example, into MySQL and other database programs.
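If a manual round trip through Notepad is inconvenient, the same re-encoding can be sketched in Scala. This is only a minimal sketch, assuming the input really is windows-1252 (as the question reports); the object name ConvertToUtf8 is hypothetical, and the whole file is read into memory, which is fine for small CSVs but not for very large ones.

```scala
import java.nio.charset.{Charset, StandardCharsets}
import java.nio.file.{Files, Path}

object ConvertToUtf8 {
  // Assumption: the source file is encoded as windows-1252.
  private val sourceCharset: Charset = Charset.forName("windows-1252")

  // Decode the bytes of `src` as windows-1252 and write them back out as UTF-8 to `dst`.
  def convert(src: Path, dst: Path): Unit = {
    val text = new String(Files.readAllBytes(src), sourceCharset)
    Files.write(dst, text.getBytes(StandardCharsets.UTF_8))
    ()
  }
}
```

It could be called as ConvertToUtf8.convert(Paths.get("C:/mon_usatotaldat.csv"), Paths.get("C:/mon_usatotaldat_utf8.csv")) with java.nio.file.Paths imported, where the second path is a made-up output name.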

Reference: Excel to CSV with UTF8 encoding

Hope this helps!




Answer 2:


Set up an InputStreamReader to correctly read windows-1252. Don't bother with intermediate UTF-8.
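A minimal sketch of what that might look like, assuming the file is windows-1252 as stated in the question. The object and method names (ReadCp1252, readWithReader, readWithSource) are hypothetical. The second method shows the scala.io.Source equivalent, where the file name is the first argument and the encoding the second.

```scala
import java.io.{BufferedReader, FileInputStream, InputStreamReader}
import java.nio.charset.Charset
import scala.io.Source

object ReadCp1252 {
  // Assumption: the file is encoded as windows-1252.
  private val cp1252: Charset = Charset.forName("windows-1252")

  // Option 1: plain java.io, decoding the byte stream as windows-1252.
  def readWithReader(path: String): List[String] = {
    val reader = new BufferedReader(
      new InputStreamReader(new FileInputStream(path), cp1252))
    try Iterator.continually(reader.readLine()).takeWhile(_ != null).toList
    finally reader.close()
  }

  // Option 2: scala.io.Source, with the file name first and the encoding second.
  def readWithSource(path: String): List[String] = {
    val source = Source.fromFile(path, "windows-1252")
    try source.getLines().toList
    finally source.close()
  }
}
```

Either way the lines arrive as ordinary Scala strings, so no intermediate UTF-8 copy of the file is needed.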



Source: https://stackoverflow.com/questions/35141258/scala-java-nio-charset-unmappablecharacterexception-input-length-1
