C# StreamWriter writes extra bytes to the Stream

Submitted by 自作多情 on 2021-02-05 07:51:09

Question


var memStream = new MemoryStream();
using (var sw = new StreamWriter(memStream, Encoding.UTF8, 4194304 /* 4 MiB */, leaveOpen: true))
{
    var str = new string(Enumerable.Repeat(' ', 10240 /* 10 * KiB */).ToArray());
    Console.WriteLine(str.Length);
    Console.WriteLine(Encoding.UTF8.GetBytes(str).Length);
    sw.Write(str);
    sw.Flush();
    Console.WriteLine(memStream.Length);
}
// Output
// ---------
// 10240
// 10240
// 10243

// Output which I was expecting
// ---------
// 10240
// 10240
// 10240

I checked the StreamWriter.Write(String) documentation on MSDN and read the rest of the StreamWriter documentation thoroughly, but I found nothing saying that this API can write extra bytes to the stream. I am using .NET Core 3.1; I am guessing this behavior also holds for .NET Core 2.0 and .NET Framework, although I have not tested that. Is this a bug, expected behavior, or am I missing something?


Answer 1:


When I run this locally, I get:

10240
10240
10243

On further inspection, the 3 extra bytes are at the beginning of the stream: 239 187 191, or EF BB BF in hex. This is the UTF-8 byte order mark (BOM, https://en.wikipedia.org/wiki/Byte_order_mark), which Encoding.UTF8 is configured to emit, so the StreamWriter writes it ahead of the text.
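One way to confirm where the extra bytes sit is to dump what the writer actually produced. The following is a minimal sketch, not part of the original answer: the class name BomInspection and the helper WriteAndCapture are hypothetical names chosen for illustration, and a short string is used so the output stays readable.

```csharp
using System;
using System.IO;
using System.Text;

static class BomInspection
{
    // Hypothetical helper: writes `text` through a StreamWriter using the
    // given encoding and returns every byte that reached the stream.
    public static byte[] WriteAndCapture(string text, Encoding encoding)
    {
        using var memStream = new MemoryStream();
        using (var sw = new StreamWriter(memStream, encoding, 1024, leaveOpen: true))
        {
            sw.Write(text);
        } // disposing the writer flushes it; the stream stays open

        return memStream.ToArray();
    }

    static void Main()
    {
        byte[] bytes = WriteAndCapture("abc", Encoding.UTF8);

        // Hex dump of the whole stream: the BOM comes first, then "abc".
        Console.WriteLine(BitConverter.ToString(bytes)); // EF-BB-BF-61-62-63

        // The leading bytes match the preamble reported by the encoding.
        byte[] preamble = Encoding.UTF8.GetPreamble();
        Console.WriteLine(BitConverter.ToString(preamble)); // EF-BB-BF
    }
}
```

The same check should work on the 10240-character string from the question; only the first three bytes differ from the plain encoded text.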

To remove these extra bytes from the output, pass new UTF8Encoding(false), which omits the BOM, instead of Encoding.UTF8 when creating the StreamWriter:

using (var sw = new StreamWriter(memStream, new UTF8Encoding(false), 4194304 /* 4 MiB */, leaveOpen: true))



Answer 2:


You can prevent the BOM from being written by creating a UTF8Encoding that does not emit the UTF-8 identifier, using new UTF8Encoding(false):

var memStream = new MemoryStream();
using (var sw = new StreamWriter(memStream, new UTF8Encoding(false), 4194304 /* 4 MiB */, leaveOpen: true))
{
    var str = new string(Enumerable.Repeat(' ', 10240 /* 10 * KiB */).ToArray());
    Console.WriteLine(str.Length);
    Console.WriteLine(Encoding.UTF8.GetBytes(str).Length);
    sw.Write(str);
    sw.Flush();
    Console.WriteLine(memStream.Length);
}
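To see why swapping the encoding changes the byte count, it may help to compare the preamble each encoding reports, since that preamble is what the StreamWriter puts at the start of the stream. This is a small sketch under that assumption; PreambleComparison and PreambleLength are hypothetical names used only for illustration.

```csharp
using System;
using System.Text;

static class PreambleComparison
{
    // Hypothetical helper: number of bytes the encoding asks writers to emit
    // before any text (the BOM, if the encoding has one).
    public static int PreambleLength(Encoding encoding) => encoding.GetPreamble().Length;

    static void Main()
    {
        // Encoding.UTF8 emits the 3-byte UTF-8 identifier.
        Console.WriteLine(PreambleLength(Encoding.UTF8));           // 3

        // Passing false suppresses the identifier, so nothing is prepended.
        Console.WriteLine(PreambleLength(new UTF8Encoding(false))); // 0

        // Passing true gives the same preamble as Encoding.UTF8.
        Console.WriteLine(PreambleLength(new UTF8Encoding(true)));  // 3
    }
}
```

The 3-byte difference matches the gap between 10243 and 10240 in the question's output.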


Source: https://stackoverflow.com/questions/60298392/c-sharp-streamwriter-writes-extra-bytes-to-the-stream
