问题
The following code prints something like °Ð½Ð´Ð¸Ñ-ÐÑпаниÑ
getDirectoryContents "path/to/directory/that/contains/files/with/nonASCII/names"
>>= mapM_ putStrLn
Looks like it is a ghc bug and it is fixed already in repository. But what to do until everybody upgrade ghc?
The last time I encountered such the problem (it was few years ago, btw), I used utf8-string package to convert strings, but I don't remember how I did it, and ghc unicode support was changed visibly last years.
So, what is the best (or at least working) way to get directory contents with full unicode support?
ghc version 7.0.4 locale en_US.UTF-8
回答1:
Here's a simple workaround using decodeString and encodeString from utf8-string.
import System.Directory
import qualified Codec.Binary.UTF8.String as UTF8
main = do
getDirectoryContents "." >>= mapM_ (putStrLn . UTF8.decodeString)
putStrLn "------------"
readFile (UTF8.encodeString "brøken-file-nåme.txt") >>= putStrLn
Output:
.
..
brøken-file-nåme.txt
Broken.hs
------------
hello
回答2:
I would recommend looking at system-filepath, which provides an abstract datatype for representing filepaths. I've used it extensively for some internal code and it works wonderfully.
来源:https://stackoverflow.com/questions/6806686/system-directory-getdirectorycontents-unicode-support