Storing a string as UTF8 in C#

忘了有多久 2021-02-01 14:30

I'm doing a lot of string manipulation in C#, and really need the strings to be stored one byte per character. This is because I need gigabytes of text simultaneously in memory.

4 Answers
  •  没有蜡笔的小新
    2021-02-01 15:10

    As I understand it, your problem is that a char in C# occupies 2 bytes instead of one.
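
    To make the size difference concrete, here is a small sketch comparing the bytes taken by a string's UTF-16 chars with the bytes needed for the same text in UTF-8. The class name, method names, and sample string are made up for illustration, and the fixed per-object overhead of a string is ignored.

    ```csharp
    using System;
    using System.Text;

    public static class CharSizeDemo
    {
        // Bytes used by the characters of a .NET string (UTF-16, 2 bytes per char).
        public static int Utf16Bytes(string s) => s.Length * sizeof(char);

        // Bytes needed to hold the same text encoded as UTF-8.
        public static int Utf8Bytes(string s) => Encoding.UTF8.GetByteCount(s);

        public static void Main()
        {
            string sample = "hello world"; // hypothetical ASCII-only sample
            Console.WriteLine(Utf16Bytes(sample)); // 22
            Console.WriteLine(Utf8Bytes(sample));  // 11
        }
    }
    ```

    For ASCII-only text the UTF-8 form is half the size; text with many non-ASCII characters saves less.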

    One way to read a text file as raw bytes is to open it with:

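        byte[] buffer;
        int read;
        // "using" closes the reader and the stream even if an exception is thrown.
        using (System.IO.FileStream fs = new System.IO.FileStream(file, System.IO.FileMode.Open, System.IO.FileAccess.Read))
        using (System.IO.BinaryReader br = new System.IO.BinaryReader(fs))
        {
            // Size the buffer to the file so the whole file fits.
            buffer = new byte[fs.Length];
            // read holds the number of bytes actually read.
            read = br.Read(buffer, 0, buffer.Length);
        }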
    

    This way you read the raw bytes from the file instead of 2-byte chars. I tried it with *.txt files encoded in UTF-8, which uses 1 byte for ASCII characters and 2 to 4 bytes for other characters, and in ANSI, which is 1 byte per char.
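
    If the text is already in memory as strings rather than in a file, one possible approach is to encode each string to UTF-8 once, keep only the byte array, and decode back to a string when one is needed. The sketch below uses System.Text.Encoding.UTF8; the names Utf8Store, Pack, and Unpack are hypothetical, and this is not the only way to do it.

    ```csharp
    using System;
    using System.Text;

    public static class Utf8Store
    {
        // Encode once and keep only the byte[]; ASCII text costs 1 byte per character.
        public static byte[] Pack(string s) => Encoding.UTF8.GetBytes(s);

        // Decode back to a regular string only when one is actually needed.
        public static string Unpack(byte[] utf8) => Encoding.UTF8.GetString(utf8);

        public static void Main()
        {
            byte[] packed = Pack("gigabytes of text"); // hypothetical sample
            Console.WriteLine(packed.Length);          // 17
            Console.WriteLine(Unpack(packed));         // gigabytes of text
        }
    }
    ```

    Keep in mind that a byte index equals a character index only for ASCII text, so any manipulation done directly on the bytes has to account for multi-byte characters.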
