C#: What takes up more memory? A string or bytearray?

Submitted by ぐ巨炮叔叔 on 2021-02-08 13:09:52

Question


C#: What takes up more memory? A string or bytes?

Let's say I have a line that reads "My Text". In which form would that line use more memory: as a byte array or as a string?


Answer 1:


The string takes more; the byte array takes less. A byte array can store your text as ASCII (1 byte per character), whereas a .NET string stores it as UTF-16, which needs 2 bytes per character for this text. Keep in mind, though, that .NET strings are usually far more convenient to work with, and in a large application the difference in size probably won't matter much.

(Note also that a .NET string uses 2 bytes per character even when it contains only ASCII characters.)




Answer 2:


It depends on the character encoding of the byte array. You can convert any string into an array of bytes, but you have to choose the encoding; there is no single standard or correct encoding. What used to be called ASCII is of no use outside the English-speaking world.

In most encodings, "My Text" would be 7 bytes long. But throw in some European accented characters, or Japanese characters, and those (if they can be represented at all) may be more than one or two bytes each. In some encodings, with some text strings, the byte-array representation may be larger than the internal Unicode representation used by System.String.
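A minimal sketch of that comparison, assuming the standard `System.Text.Encoding` classes; the class name `EncodingSizes`, the helper `ByteCount`, and the sample text "日本語" are made up for illustration and do not come from the answer:

```csharp
using System;
using System.Text;

public static class EncodingSizes
{
    // Number of bytes needed to store `text` in the given encoding
    // (raw character data only, no byte-order mark, no object overhead).
    public static int ByteCount(string text, Encoding encoding) =>
        encoding.GetByteCount(text);

    public static void Main()
    {
        foreach (string text in new[] { "My Text", "日本語" })
        {
            Console.WriteLine(
                $"{text}: UTF-8={ByteCount(text, Encoding.UTF8)}, " +
                $"UTF-16={ByteCount(text, Encoding.Unicode)}, " +
                $"UTF-32={ByteCount(text, Encoding.UTF32)}");
        }
    }
}
```

For "My Text" the counts are 7, 14 and 28 bytes. For "日本語" they are 9, 6 and 12 bytes, so here both the UTF-8 and the UTF-32 byte arrays hold more data than the UTF-16 form used by `System.String`.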




Answer 3:


Being Unicode doesn't mean that text will take more than one byte per character; it just means it could, depending on the encoding used.

http://www.joelonsoftware.com/articles/Unicode.html




Answer 4:


What takes up more memory?

So you are asking about the size of the in-memory representation. .NET uses UTF-16 for strings, which means your example is represented by 14 bytes, as this hex dump (UTF-16LE) shows:

4d 00 79 00 20 00 54 00  65 00 78 00 74 00

The size of the byte array will depend on the encoding that you use to represent the text. If you use UTF-16, like this

Encoding.Unicode.GetBytes("My Text")

you obviously get the same 14 bytes. If you use UTF-8 instead:

Encoding.UTF8.GetBytes("My Text")

you get an array of 7 bytes:

4d 79 20 54 65 78 74

This is the same size (and the same representation) as ASCII, because your example only uses characters that are available in the ASCII charset. All of those characters are, by definition, encoded identically in UTF-8.
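The dumps above can be reproduced with a small sketch like the following. The class name `HexDumpDemo` and the helper `HexDump` are hypothetical names chosen for this example; only the `Encoding` and `BitConverter` calls are standard library APIs.

```csharp
using System;
using System.Text;

public static class HexDumpDemo
{
    // Formats bytes as lowercase hex pairs separated by single spaces.
    public static string HexDump(byte[] bytes) =>
        BitConverter.ToString(bytes).Replace("-", " ").ToLowerInvariant();

    public static void Main()
    {
        const string text = "My Text";

        // Encoding.Unicode is UTF-16 little-endian: 14 bytes.
        Console.WriteLine(HexDump(Encoding.Unicode.GetBytes(text)));

        // UTF-8 and ASCII give identical 7-byte arrays for this text.
        Console.WriteLine(HexDump(Encoding.UTF8.GetBytes(text)));
        Console.WriteLine(HexDump(Encoding.ASCII.GetBytes(text)));
    }
}
```

The first line printed has the same byte values as the UTF-16LE dump, and the second and third lines are identical to each other.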

Now if you use non-ASCII characters instead, say the Japanese "日", the UTF-8 encoding would need 3 bytes:

e6 97 a5

UTF-16 would need only 2 bytes:

e5 65

Trying to convert the Japanese character to ASCII would yield an exception or just use a "?" character, depending on how you configure the Encoding, because ASCII cannot represent anything but ASCII characters.
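Both behaviours can be sketched as follows. The class and method names (`AsciiFallbackDemo`, `EncodeLossy`, `EncodeStrict`) are invented for this example; the strict variant assumes you build the encoding with `Encoding.GetEncoding` and the exception fallbacks, which is one way to configure it.

```csharp
using System;
using System.Text;

public static class AsciiFallbackDemo
{
    // Default ASCII encoder: characters it cannot represent become '?' (0x3F).
    public static byte[] EncodeLossy(string text) =>
        Encoding.ASCII.GetBytes(text);

    // ASCII encoder configured to throw EncoderFallbackException instead.
    public static byte[] EncodeStrict(string text)
    {
        Encoding strictAscii = Encoding.GetEncoding(
            "us-ascii",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        return strictAscii.GetBytes(text);
    }

    public static void Main()
    {
        byte[] lossy = EncodeLossy("日");
        Console.WriteLine($"{lossy.Length} byte, value 0x{lossy[0]:x2}");

        try
        {
            EncodeStrict("日");
        }
        catch (EncoderFallbackException)
        {
            Console.WriteLine("Strict ASCII encoding refused the character.");
        }
    }
}
```

The lossy variant silently destroys data, so the strict variant is usually the safer choice when you are not sure the input is pure ASCII.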

Another, slightly different example is the European character "ä". It takes 2 bytes in UTF-8:

c3 a4

Also 2 bytes in UTF-16:

e4 00

ASCII can't represent this character.

To sum up, the memory consumed depends on the actual data in your strings and what encoding you use to represent it.

All of the above covers the memory consumption of the raw data only. To calculate the total memory consumption, you would also have to include the metadata that is part of every array and string, such as its length, and, in the case of .NET strings, a null terminator (2 additional bytes with the value 0). The number of bytes for the metadata is constant and relatively small, so any difference between string and array there would only matter if you had a very large number of very small texts.




Answer 5:


Both are pretty close. Only one real answer:

Profile it on your framework/architecture.




Answer 6:


The byte array would take less memory unless you had several copies of the string, in which case the string could take up less memory thanks to the string table (the intern pool, which holds string literals and any strings you intern explicitly).

But the real question is, does it really matter? There are a lot of benefits to using the string as a string, rather than storing it as an array of bytes.
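The sharing mentioned above is known in .NET as string interning. A minimal sketch of it, assuming the standard `string.Intern` method; the class name `InternDemo` and the helper `SameInstanceAfterIntern` are made up for this example:

```csharp
using System;

public static class InternDemo
{
    // True when both strings resolve to the same pooled object after interning.
    public static bool SameInstanceAfterIntern(string a, string b) =>
        object.ReferenceEquals(string.Intern(a), string.Intern(b));

    public static void Main()
    {
        // Built from char arrays at run time, so these are two separate objects
        // that happen to hold the same text.
        string first = new string("My Text".ToCharArray());
        string second = new string("My Text".ToCharArray());

        Console.WriteLine(object.ReferenceEquals(first, second));  // False
        Console.WriteLine(SameInstanceAfterIntern(first, second)); // True
    }
}
```

Note that only literals are interned automatically. Strings created at run time are stored separately unless you call `string.Intern` yourself, so duplicates do not save memory by default.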

I don't know the particulars, since your question was very narrow, but I smell premature optimization.




Answer 7:


There's a good blog post here that gives an equation for how much space a string takes up, as well as various interactions with StringBuilder & instance allocations



Source: https://stackoverflow.com/questions/913036/c-what-takes-up-more-memory-a-string-or-bytearray
