LoadFromFile with Unicode data

攒了一身酷 2020-12-31 18:24

My input file (f) contains some Unicode text (Swedish) that isn't being read correctly.

Neither of these approaches works, although they give different results:



        
2 answers
  •  太阳男子
    2020-12-31 19:05

    In order to load a Unicode text file you need to know its encoding. If the file has a Byte Order Mark (BOM), then you can simply call LoadFromFile(FileName) and the RTL will use the BOM to determine the encoding.
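To check whether a given file actually starts with a BOM, the first few bytes can be inspected directly. The following is only a sketch: `DescribeBom` and the path `input.txt` are hypothetical names, not part of the RTL, and the scoped unit names assume Delphi XE2 or later (older versions would use `SysUtils, Classes`).

```pascal
program BomSniff;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

// Hypothetical helper: reports which BOM, if any, starts the file.
function DescribeBom(const FileName: string): string;
var
  Stream: TFileStream;
  Buf: array[0..2] of Byte;
  Count: Integer;
begin
  Stream := TFileStream.Create(FileName, fmOpenRead or fmShareDenyWrite);
  try
    // Read at most three bytes; Count may be smaller for tiny files.
    Count := Stream.Read(Buf, SizeOf(Buf));
  finally
    Stream.Free;
  end;

  if (Count >= 3) and (Buf[0] = $EF) and (Buf[1] = $BB) and (Buf[2] = $BF) then
    Result := 'UTF-8'
  else if (Count >= 2) and (Buf[0] = $FF) and (Buf[1] = $FE) then
    // Note: a UTF-32 LE BOM (FF FE 00 00) also starts with these two bytes.
    Result := 'UTF-16 LE'
  else if (Count >= 2) and (Buf[0] = $FE) and (Buf[1] = $FF) then
    Result := 'UTF-16 BE'
  else
    Result := 'none';
end;

begin
  // 'input.txt' is a placeholder path for illustration.
  Writeln('BOM: ', DescribeBom('input.txt'));
end.
```

If this reports 'none', the encoding has to come from some other source, such as knowledge of the program that wrote the file.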

    If the file does not have a BOM then you need to explicitly specify the encoding, e.g.

    LoadFromFile(FileName, TEncoding.UTF8);
    LoadFromFile(FileName, TEncoding.Unicode);          // UTF-16 LE
    LoadFromFile(FileName, TEncoding.BigEndianUnicode); // UTF-16 BE
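For context, a complete call might look like the sketch below. It assumes the file is UTF-8 without a BOM; the program name and the path `input.txt` are placeholders, and the scoped unit names assume Delphi XE2 or later.

```pascal
program LoadUtf8Demo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes;

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // 'input.txt' is a hypothetical UTF-8 file without a BOM.
    // Passing the encoding explicitly tells the RTL how to decode the bytes.
    Lines.LoadFromFile('input.txt', TEncoding.UTF8);
    Writeln('Lines read: ', Lines.Count);
  finally
    Lines.Free;
  end;
end.
```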
    

    For some reason unknown to me, there is no built-in support for UTF-32, but if you had such a file it would be easy enough to add a TEncoding instance to handle it.
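As an illustration of the UTF-32 case, here is a rough sketch that takes a different route from a TEncoding descendant: it reads the raw bytes and converts each little-endian 32-bit code point to UTF-16 by hand. `DecodeUtf32LE` and the path `input-utf32.txt` are hypothetical names. The code assumes well-formed UTF-32 LE input and does no validation. `TFile.ReadAllBytes` comes from `System.IOUtils` (Delphi 2010 or later; scoped unit names assume XE2 or later).

```pascal
program Utf32Demo;

{$APPTYPE CONSOLE}

uses
  System.SysUtils, System.Classes, System.IOUtils;

// Hypothetical helper: decodes UTF-32 LE bytes into a UTF-16 string.
// Assumes valid input; trailing bytes that do not form a full unit are ignored.
function DecodeUtf32LE(const Bytes: TBytes): string;
var
  I: Integer;
  CodePoint: Cardinal;
begin
  Result := '';
  I := 0;
  // Skip a UTF-32 LE BOM (FF FE 00 00) if present.
  if (Length(Bytes) >= 4) and (Bytes[0] = $FF) and (Bytes[1] = $FE) and
     (Bytes[2] = 0) and (Bytes[3] = 0) then
    I := 4;

  while I + 3 < Length(Bytes) do
  begin
    CodePoint := Cardinal(Bytes[I]) or (Cardinal(Bytes[I + 1]) shl 8) or
      (Cardinal(Bytes[I + 2]) shl 16) or (Cardinal(Bytes[I + 3]) shl 24);
    if CodePoint < $10000 then
      // Basic Multilingual Plane: one UTF-16 code unit.
      Result := Result + Char(CodePoint)
    else
    begin
      // Supplementary plane: encode as a surrogate pair.
      Dec(CodePoint, $10000);
      Result := Result + Char($D800 + (CodePoint shr 10)) +
        Char($DC00 + (CodePoint and $3FF));
    end;
    Inc(I, 4);
  end;
end;

var
  Lines: TStringList;
begin
  Lines := TStringList.Create;
  try
    // 'input-utf32.txt' is a placeholder path for illustration.
    Lines.Text := DecodeUtf32LE(TFile.ReadAllBytes('input-utf32.txt'));
    Writeln('Lines read: ', Lines.Count);
  finally
    Lines.Free;
  end;
end.
```

Building the string by repeated concatenation is fine for small files. For large ones, preallocating the result or wrapping the same logic in a TEncoding descendant, as suggested above, would likely be a better fit.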
