Question
Because the constructor of java.io.File takes a java.lang.String as argument, there is seemingly no possibility to tell it which filename encoding to expect when accessing the filesystem layer. So when you generally use UTF-8 as filename encoding and there is some filename containing an umlaut encoded as ISO-8859-1, you are basically **. Is this correct?
Update: because no one seems to get it, try it yourself: when creating a new file, the environment variable LC_ALL (on Linux) determines the encoding of the filename. It does not matter what you do inside your source code!
If you want to give a correct answer, demonstrate that you can create a file (using regular Java means) with proper ISO-8859-1 encoding while your JVM assumes LC_ALL=en_US.UTF-8. The filename should contain a character like ö, ü, or ä.
BTW: if you put files whose names are not encoded according to LC_ALL into Maven's resource path, it will just skip them.
Update II.
Fix this: https://github.com/jjYBdx4IL/filenameenc
i.e. make the f.exists() statement return true.
Update III.
The solution is to use java.nio.*; in my case I had to replace File.listFiles() with Files.newDirectoryStream(). I have updated the example on GitHub. BTW: Maven still seems to use the old java.io API; mvn clean fails.
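A minimal sketch of the replacement described in Update III, assuming the goal is simply to enumerate a directory through java.nio rather than File.listFiles(). The class name ListDir and the helper list() are hypothetical and are not taken from the linked repository.

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

public class ListDir {
    // Collects the entries of a directory via Files.newDirectoryStream()
    // instead of File.listFiles(). The stream is closed by try-with-resources.
    public static List<Path> list(Path dir) throws IOException {
        List<Path> entries = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir)) {
            for (Path entry : stream) {
                entries.add(entry);
            }
        }
        return entries;
    }

    public static void main(String[] args) throws IOException {
        // Directory to list; defaults to the current directory.
        Path dir = Paths.get(args.length > 0 ? args[0] : ".");
        for (Path entry : list(dir)) {
            // Check existence through the Path the stream returned,
            // without converting it back to a String or a File.
            System.out.println(entry + " exists: " + Files.exists(entry));
        }
    }
}
```

The point of the sketch is to keep working with the Path objects the directory stream hands back; whether this helps for a given file still depends on the platform and JVM, as the update above reports for the author's own case.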
Answer 1:
The solution is to use the new API and file.encoding. Demonstration:
fge@alustriel:~/tmp/filenameenc$ echo $LC_ALL
en_US.UTF-8
fge@alustriel:~/tmp/filenameenc$ cat Test.java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
public class Test
{
    public static void main(String[] args)
    {
        final String testString = "a/üöä";
        final Path path = Paths.get(testString);
        final File file = new File(testString);
        System.out.println("Files.exists(): " + Files.exists(path));
        System.out.println("File exists: " + file.exists());
    }
}
fge@alustriel:~/tmp/filenameenc$ install -D /dev/null a/üöä
fge@alustriel:~/tmp/filenameenc$ java Test
Files.exists(): true
File exists: true
fge@alustriel:~/tmp/filenameenc$ java -Dfile.encoding=iso-8859-1 Test
Files.exists(): false
File exists: true
fge@alustriel:~/tmp/filenameenc$
One less reason to use File!
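To see which settings a given JVM run is actually using, something like the following may help. It is only a sketch: the class name EncodingInfo is hypothetical, and sun.jnu.encoding is an implementation-specific property that is present on OpenJDK-based runtimes but may be absent (null) elsewhere.

```java
import java.nio.charset.Charset;
import java.util.LinkedHashMap;
import java.util.Map;

public class EncodingInfo {
    // Gathers the encoding-related settings of the running JVM.
    // Values may be null when a property is not defined.
    public static Map<String, String> settings() {
        Map<String, String> info = new LinkedHashMap<>();
        info.put("file.encoding", System.getProperty("file.encoding"));
        // Implementation-specific; assumed to exist on OpenJDK-based JVMs only.
        info.put("sun.jnu.encoding", System.getProperty("sun.jnu.encoding"));
        info.put("defaultCharset", Charset.defaultCharset().name());
        info.put("LC_ALL", System.getenv("LC_ALL"));
        return info;
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : settings().entrySet()) {
            System.out.println(e.getKey() + " = " + e.getValue());
        }
    }
}
```

Running it once plainly and once with -Dfile.encoding=iso-8859-1, as in the session above, shows which of these values the flag changes on your JVM.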
Answer 2:
Currently I am sitting at a Windows machine, but assuming you can fetch the file system encoding:
String encoding = System.getProperty("file.encoding");
// Alternatively, read the locale; the charset is only the part after the dot, e.g. "en_US.UTF-8".
String locale = System.getenv("LC_ALL");
Then you have the means to check whether a filename is valid. Mind: Windows can represent Unicode filenames, and my own Linux of course uses UTF-8.
boolean validEncodingForFileName(String name) {
    try {
        byte[] bytes = name.getBytes(encoding);
        String nameAgain = new String(bytes, encoding);
        return name.equals(nameAgain); // Nothing lost?
    } catch (UnsupportedEncodingException ex) {
        return false; // Maybe true, more a JRE limitation.
    }
}
You might try whether File is clever enough (I cannot test it):
boolean validEncodingForFileName(String name) throws IOException {
    // getCanonicalPath() can throw IOException, so it must be declared or caught.
    return new File(name).getCanonicalPath().endsWith(name);
}
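A variant of the round-trip check above, sketched with CharsetEncoder.canEncode() instead of encoding and decoding by hand. The class name NameCheck and the method encodable() are hypothetical; the charset name is passed in explicitly rather than read from file.encoding.

```java
import java.nio.charset.Charset;

public class NameCheck {
    // Returns true if every character of the name can be represented
    // in the named charset, i.e. nothing would be lost when encoding it.
    public static boolean encodable(String name, String charsetName) {
        // Charset.forName throws an unchecked exception for unknown names.
        return Charset.forName(charsetName).newEncoder().canEncode(name);
    }

    public static void main(String[] args) {
        String umlaut = "\u00f6"; // ö
        String euro = "\u20ac";   // euro sign, not part of ISO-8859-1
        System.out.println(encodable(umlaut, "ISO-8859-1")); // true
        System.out.println(encodable(euro, "ISO-8859-1"));   // false
        System.out.println(encodable(euro, "UTF-8"));        // true
    }
}
```

Like the round-trip version, this only tells you whether a name can be expressed in a charset; it says nothing about which bytes are already on disk.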
Answer 3:
A Java String holds Unicode characters, so it can represent the name whatever encoding it came from:
new File("the file name with \u00d6")
or
new File("the file name with Ö")
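The String itself is the same in both cases; what differs is the byte sequence it turns into once a charset is applied. A small sketch of that, with the hypothetical class NameBytes and helper hex():

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class NameBytes {
    // Renders the bytes of a name in the given charset as lowercase hex.
    public static String hex(String name, Charset charset) {
        StringBuilder sb = new StringBuilder();
        for (byte b : name.getBytes(charset)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String name = "\u00d6"; // Ö
        System.out.println(hex(name, StandardCharsets.UTF_8));      // c3 96
        System.out.println(hex(name, StandardCharsets.ISO_8859_1)); // d6
    }
}
```

This is why the question's ISO-8859-1 file is hard to reach: the String is fine, but the bytes sent to the filesystem depend on the encoding the JVM picks.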
Answer 4:
You can set the encoding when reading and writing the file. For example, when you write to a file you can pass the encoding to your OutputStreamWriter as follows: new OutputStreamWriter(new FileOutputStream(fileName), "UTF-8").
When you read a file you can pass the decoding character set to the following constructor: InputStreamReader(InputStream in, CharsetDecoder dec).
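A short sketch of the two constructors mentioned in this answer, assuming ISO-8859-1 as the charset. The class name ContentEncoding and its methods are hypothetical. Note that these constructors control how the file's contents are encoded, not its name.

```java
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.Writer;
import java.nio.charset.StandardCharsets;

public class ContentEncoding {
    // Writes the text using ISO-8859-1 for the file contents.
    public static void write(File file, String text) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(file), StandardCharsets.ISO_8859_1)) {
            out.write(text);
        }
    }

    // Reads the contents back, decoding them with an ISO-8859-1 CharsetDecoder.
    public static String read(File file) throws IOException {
        StringBuilder sb = new StringBuilder();
        try (Reader in = new InputStreamReader(
                new FileInputStream(file), StandardCharsets.ISO_8859_1.newDecoder())) {
            int c;
            while ((c = in.read()) != -1) {
                sb.append((char) c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) throws IOException {
        File file = File.createTempFile("content", ".txt");
        write(file, "\u00fc\u00f6\u00e4"); // üöä
        // Three characters become three bytes in ISO-8859-1.
        System.out.println(file.length() + " bytes, read back: " + read(file));
        file.delete();
    }
}
```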
Source: https://stackoverflow.com/questions/22775758/java-io-file-accessing-files-with-invalid-filename-encodings