Reading UTF-8 text files with ReadList
Is it possible to read UTF-8 (or otherwise encoded) text files with ReadList[..., Word], or is ReadList ASCII-only? If it is ASCII-only, is it possible to "fix" the encoding of the data after it has been read, with good performance (i.e. preserving the performance advantage of ReadList over Import)?

Import[..., CharacterEncoding -> "UTF8"] works, but it is quite a bit slower than ReadList. Setting $CharacterEncoding has no effect on ReadList.

Download a sample UTF-8 encoded file here. For testing performance on a large input, see the test file in this question.

Here are the timings of the answers on
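
To make the "fix the encoding after reading" idea concrete, here is a minimal sketch. It assumes that ReadList returns each word as a string whose characters are the raw bytes of the file (codes 0 to 255), which may not hold on every platform or version. The file name and the test bytes are hypothetical and only serve as an illustration; this is not necessarily the fastest approach.

```mathematica
(* Hypothetical test file containing the UTF-8 bytes of two accented words *)
file = FileNameJoin[{$TemporaryDirectory, "utf8-sample.txt"}];
BinaryWrite[file, {195, 169, 116, 195, 169, 32, 110, 97, 195, 175, 102, 10}];
Close[file];

(* Fast read; under the assumption above, multi-byte characters come back as
   several single-byte characters each *)
raw = ReadList[file, Word];

(* Reinterpret the character codes of every string as UTF-8 bytes.
   ToCharacterCode and FromCharacterCode both accept whole lists,
   so no explicit Map is needed *)
fixed = FromCharacterCode[ToCharacterCode[raw], "UTF8"]
```

If the assumption about ReadList holds, fixed contains the correctly decoded words, and the only extra cost over a plain ReadList call is one pass of ToCharacterCode and FromCharacterCode over the data.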