Difference in writing string vs. char array with System.IO.BinaryWriter

Posted by 荒凉一梦 on 2019-12-07 08:32:23

Question


I'm writing text to a binary file in C# and see a difference in the number of bytes written depending on whether I write a string or a character array. I'm using System.IO.BinaryWriter and watching BinaryWriter.BaseStream.Length as the writes occur. These are my results:

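using System.IO;
using System.Text;

// File.Open requires a FileMode; Create overwrites any existing file.
using (BinaryWriter bw = new BinaryWriter(File.Open("data.dat", FileMode.Create), Encoding.ASCII))
{
  string value = "Foo";

  // Writes 4 bytes
  bw.Write(value);

  // Writes 3 bytes
  bw.Write(value.ToCharArray());
}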

I don't understand why the string overload writes 4 bytes when I'm writing only 3 ASCII characters. Can anyone explain this?


Answer 1:


The documentation for BinaryWriter.Write(string) states that it writes a length-prefixed string to this stream. The overload for Write(char[]) has no such prefixing.

It would seem to me that the extra data is the length.

EDIT:

Just to be a bit more explicit, use Reflector. You will see that it has this piece of code in there as part of the Write(string) method:

this.Write7BitEncodedInt(byteCount);

It encodes an integer in as few bytes as possible, seven bits at a time. For short strings (those whose encoded length is under 128 bytes, which covers most day-to-day cases), the length fits in a single byte. Longer strings need more bytes for the prefix.

Here is the code for that function just in case you are interested:

protected void Write7BitEncodedInt(int value)
{
    uint num = (uint) value;
    while (num >= 0x80)
    {
        this.Write((byte) (num | 0x80));
        num = num >> 7;
    }
    this.Write((byte) num);
}

After prefixing the length using this encoding, it writes the bytes for the characters in the desired encoding.
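To make the encoding concrete, here is a minimal sketch that repeats the loop above but collects the bytes into an array instead of writing them to a stream. The class name SevenBitDemo and the method Encode are hypothetical names chosen for illustration; they are not part of BinaryWriter.

```csharp
using System;
using System.Collections.Generic;

public static class SevenBitDemo
{
    // Hypothetical helper: same loop as Write7BitEncodedInt,
    // but returns the bytes instead of writing them to a stream.
    public static byte[] Encode(int value)
    {
        var bytes = new List<byte>();
        uint num = (uint)value;
        while (num >= 0x80)
        {
            // Low 7 bits plus a continuation flag in the high bit.
            bytes.Add(unchecked((byte)(num | 0x80)));
            num >>= 7;
        }
        bytes.Add((byte)num); // Final byte has the high bit clear.
        return bytes.ToArray();
    }

    public static void Main()
    {
        foreach (int n in new[] { 3, 127, 128, 300, 16384 })
        {
            Console.WriteLine(n + " -> " + BitConverter.ToString(Encode(n)));
        }
        // 3 -> 03
        // 127 -> 7F
        // 128 -> 80-01
        // 300 -> AC-02
        // 16384 -> 80-80-01
    }
}
```

For the 3-character string in the question, the prefix is the single byte 0x03, which accounts for the fourth byte.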




Answer 2:


From the BinaryWriter.Write(string) docs:

Writes a length-prefixed string to this stream in the current encoding of the BinaryWriter, and advances the current position of the stream in accordance with the encoding used and the specific characters being written to the stream.

This behavior is probably there so that the string boundaries can be identified when the file is read back with a BinaryReader. (For example, 3Foo3Bar6Foobar can be parsed into the strings "Foo", "Bar" and "Foobar", but FooBarFoobar could not be.) In fact, BinaryReader.ReadString uses exactly this information to read a string from a binary file.
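The "3Foo3Bar6Foobar" idea can be tried out with a small round trip. This is only a sketch: it uses a MemoryStream rather than a file, and the class name RoundTripDemo and its methods WriteAll and ReadAll are hypothetical helpers, not library APIs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;

public static class RoundTripDemo
{
    // Writes each string with BinaryWriter.Write(string) and returns the raw bytes.
    public static byte[] WriteAll(params string[] values)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            foreach (string v in values)
            {
                bw.Write(v);
            }
            bw.Flush();
            return ms.ToArray();
        }
    }

    // Reads length-prefixed strings until the end of the data is reached.
    public static List<string> ReadAll(byte[] data)
    {
        var result = new List<string>();
        using (var ms = new MemoryStream(data))
        using (var br = new BinaryReader(ms, Encoding.ASCII))
        {
            while (ms.Position < ms.Length)
            {
                result.Add(br.ReadString());
            }
        }
        return result;
    }

    public static void Main()
    {
        byte[] data = WriteAll("Foo", "Bar", "Foobar");
        Console.WriteLine(BitConverter.ToString(data));
        // 03-46-6F-6F-03-42-61-72-06-46-6F-6F-62-61-72
        Console.WriteLine(string.Join(", ", ReadAll(data)));
        // Foo, Bar, Foobar
    }
}
```

Each string comes back intact because ReadString first reads the prefix and then consumes exactly that many bytes.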

From the BinaryWriter.Write(char[]) docs:

Writes a character array to the current stream and advances the current position of the stream in accordance with the Encoding used and the specific characters being written to the stream.

It is hard to overstate how comprehensive and useful the docs on MSDN are. Always check them first.




Answer 3:


As already stated, BinaryWriter.Write(String) writes the length of the string to the stream, before writing the string itself.

This allows the BinaryReader.ReadString() to know how long the string is.

using (BinaryReader br = new BinaryReader(File.OpenRead("data.dat")))
{
    string foo1 = br.ReadString();
    char[] foo2 = br.ReadChars(3);
}



Answer 4:


Did you look at what was actually written? I'd guess a null terminator.
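One way to look at what was actually written is to repeat the two writes from the question against an in-memory stream and dump the bytes as hex. This is a sketch under that assumption (a MemoryStream standing in for data.dat); InspectDemo and WriteBoth are hypothetical names.

```csharp
using System;
using System.IO;
using System.Text;

public static class InspectDemo
{
    // Performs the same two writes as the question and returns every byte written.
    public static byte[] WriteBoth(string value)
    {
        using (var ms = new MemoryStream())
        using (var bw = new BinaryWriter(ms, Encoding.ASCII))
        {
            bw.Write(value);               // string overload
            bw.Write(value.ToCharArray()); // char[] overload
            bw.Flush();
            return ms.ToArray();
        }
    }

    public static void Main()
    {
        Console.WriteLine(BitConverter.ToString(WriteBoth("Foo")));
        // 03-46-6F-6F-46-6F-6F
    }
}
```

In this dump the extra byte is a leading 0x03 in front of the first "Foo", and no 0x00 appears anywhere, which is consistent with a length prefix rather than a null terminator.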



Source: https://stackoverflow.com/questions/1014727/difference-in-writing-string-vs-char-array-with-system-io-binarywriter
