Why doesn't byte[] to string and back work as expected?

Submitted by 倖福魔咒の on 2019-12-30 06:34:49

Question


I have this code:

Int32 i1 = 14000000;
byte[] b = BitConverter.GetBytes(i1);
string s = System.Text.Encoding.UTF8.GetString(b);
byte[] b2 = System.Text.Encoding.UTF8.GetBytes(s);
Int32 i2 = BitConverter.ToInt32(b2, 0);

i2 is equal to -272777233. Why isn't it the input value (14000000)?

EDIT: What I am trying to do is append it to another string, which I'm then writing to a file using WriteAllText.


Answer 1:


Because an Encoding class does not work for arbitrary input. If a byte sequence (one "character" can span several bytes in UTF-8) is not valid in that particular encoding (in your case UTF-8), the decoder substitutes a replacement character, for example:

a single QUESTION MARK (U+003F)

(Source: http://msdn.microsoft.com/en-us/library/ms404377.aspx#FallbackStrategy)

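In some cases the replacement is just a ?, for example in ASCII/CP437/ISO 8859-1, but you can choose what happens instead (see the link above). With Encoding.UTF8 the default replacement is U+FFFD (REPLACEMENT CHARACTER), which encodes back to the three bytes EF BF BD rather than to the original byte.

For example, if you try to decode (byte)128 as ASCII: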

string s = System.Text.Encoding.ASCII.GetString(new byte[] { 48, 128 }); // s = "0?"

Then convert it back:

byte[] b = System.Text.Encoding.ASCII.GetBytes(s); // b = new byte[] { 48, 63 }

You will not get the original byte array.
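The same loss can be traced for the code in the question. The following is a minimal sketch, not a definitive implementation: the class and method names are illustrative, and the input bytes are hard-coded to the little-endian layout of 14000000 (80 9F D5 00), which is what BitConverter.GetBytes produces on a little-endian machine.

```csharp
using System;
using System.Text;

public static class Utf8RoundTripDemo
{
    // Decodes arbitrary bytes as UTF-8 text and encodes that text again.
    // Invalid byte sequences are replaced during decoding, so the result
    // is generally NOT the original array.
    public static byte[] RoundTrip(byte[] input)
    {
        string s = Encoding.UTF8.GetString(input);
        return Encoding.UTF8.GetBytes(s);
    }

    // Reads the first four bytes of the round-tripped data as an Int32,
    // mirroring what the question's code does.
    public static int RoundTripAsInt32(byte[] input)
    {
        return BitConverter.ToInt32(RoundTrip(input), 0);
    }
}
```

For the input 80 9F D5 00, none of the first three bytes is valid UTF-8 in that position, so the round-tripped array starts with EF BF BD EF instead. Read as a little-endian Int32, those four bytes are 0xEFBDBFEF, which is the -272777233 reported in the question.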

For reference, see the related question "Check if character exists in encoding".


I can't think of a good reason to convert a raw byte array to a string this way. If you are writing to a stream, you can write the byte[] directly. If you need a text representation of the integer, convert it with yourIntegerVar.ToString() and use int.TryParse to get it back.
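A minimal sketch of that text round trip follows. The class and method names are illustrative, and the use of CultureInfo.InvariantCulture is an assumption added here so that the stored text does not depend on the machine's regional settings.

```csharp
using System;
using System.Globalization;

public static class IntTextDemo
{
    // Formats the integer as decimal text, e.g. 14000000 -> "14000000".
    public static string ToText(int value)
    {
        return value.ToString(CultureInfo.InvariantCulture);
    }

    // Parses decimal text back into an integer.
    // Returns false instead of throwing when the text is not a valid Int32.
    public static bool TryFromText(string text, out int value)
    {
        return int.TryParse(text, NumberStyles.Integer,
                            CultureInfo.InvariantCulture, out value);
    }
}
```

The resulting string can be appended to other text and written with File.WriteAllText without any loss, because it contains only ordinary digit characters.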


Edit:

You can write a byte array to a file, but you should not "concatenate" the byte array to a string and then use the convenience method File.WriteAllText: it performs an encoding conversion, and you will probably end up with replacement characters such as ? all over your file. Instead, open a FileStream and use FileStream.Write to write the byte array directly. Alternatively, you can use a BinaryWriter to write an integer in its binary form (and also a string), and use its counterpart BinaryReader to read them back.

Example:

using System.IO;
using System.Text;

// Write an int and a string in binary form.
using (FileStream fs = File.OpenWrite(@"C:\blah.dat"))
using (BinaryWriter bw = new BinaryWriter(fs, Encoding.UTF8))
{
    bw.Write((int)12345678);
    bw.Write("This is a string in UTF-8 :)"); // Note that BinaryWriter also prefixes the string with its length...
}

// Read them back in the same order.
using (FileStream fs = File.OpenRead(@"C:\blah.dat"))
using (BinaryReader br = new BinaryReader(fs, Encoding.UTF8))
{
    int myInt = br.ReadInt32();
    string blah = br.ReadString(); // ...so that BinaryReader can read it back.
}

This example code will result in a file which matches the following hexdump:

00  4e 61 bc 00 1c 54 68 69 73 20 69 73 20 61 20 73  Na¼..This is a s  
10  74 72 69 6e 67 20 69 6e 20 55 54 46 2d 38 20 3a  tring in UTF-8 :  
20  29                                               )   

Note that BinaryWriter.Write(string) also prefixes the string with its length (the 1c byte above, i.e. 28), and BinaryReader depends on that prefix when reading it back, so it is not appropriate to edit the resulting file in a text editor. (You are writing an integer in its binary form anyway, so I expect this is acceptable.)
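For the FileStream.Write alternative mentioned above, a minimal sketch might look like the following. The class and method names are illustrative, and the file is assumed to contain nothing but the four bytes of a single Int32.

```csharp
using System;
using System.IO;

public static class RawBytesFileDemo
{
    // Writes the four bytes of the integer directly, with no text encoding involved.
    public static void WriteInt(string path, int value)
    {
        byte[] bytes = BitConverter.GetBytes(value);
        using (FileStream fs = new FileStream(path, FileMode.Create, FileAccess.Write))
        {
            fs.Write(bytes, 0, bytes.Length);
        }
    }

    // Reads the file back as raw bytes and reinterprets the first four as an Int32.
    public static int ReadInt(string path)
    {
        byte[] bytes = File.ReadAllBytes(path);
        return BitConverter.ToInt32(bytes, 0);
    }
}
```

Because BitConverter uses the byte order of the machine it runs on, a file written this way is only guaranteed to read back correctly on a machine with the same endianness.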




Answer 2:


You shouldn't use Encoding.GetString to convert arbitrary binary data into a string. That method is only intended for text that has been encoded to binary data using a specific encoding.

Instead, you want to use a text representation which is capable of representing arbitrary binary data reversibly. The two most common ways of doing that are base64 and hex. Base64 is the simplest in .NET:

string base64 = Convert.ToBase64String(originalBytes);
...
byte[] recoveredBytes = Convert.FromBase64String(base64);

A few caveats to this:

  • If you want to use this string as a URL parameter, you should use a web-safe version of base64; I don't know of direct support for that in .NET, but you can probably find solutions easily enough
  • You should only do this at all if you really need the data in string format. If you're just trying to write it to a file or similar, it's simplest to keep it as binary data
  • Base64 isn't very human-readable; use hex if you want humans to be able to read the data in its text form without extra tooling. (There are various questions specifically about converting binary data to hex and back.)
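As a rough sketch of the hex option and of one common way to make base64 URL-safe (swapping '+' and '/' for '-' and '_' and dropping the '=' padding), something like the following could be used. The helper names are illustrative and are not part of .NET; only BitConverter and Convert calls from the framework are used.

```csharp
using System;

public static class BinaryTextDemo
{
    // BitConverter.ToString yields e.g. "80-9F-D5-00"; removing the dashes gives plain hex.
    public static string ToHex(byte[] bytes)
    {
        return BitConverter.ToString(bytes).Replace("-", "");
    }

    // Parses two hex digits per byte.
    public static byte[] FromHex(string hex)
    {
        if (hex.Length % 2 != 0)
            throw new FormatException("Hex string must have an even number of digits.");
        byte[] bytes = new byte[hex.Length / 2];
        for (int i = 0; i < bytes.Length; i++)
            bytes[i] = Convert.ToByte(hex.Substring(i * 2, 2), 16);
        return bytes;
    }

    // Standard base64 with the URL-unsafe characters replaced and padding removed.
    public static string ToBase64Url(byte[] bytes)
    {
        return Convert.ToBase64String(bytes)
                      .TrimEnd('=').Replace('+', '-').Replace('/', '_');
    }

    // Restores the standard alphabet and the padding before decoding.
    public static byte[] FromBase64Url(string text)
    {
        string s = text.Replace('-', '+').Replace('_', '/');
        switch (s.Length % 4)
        {
            case 2: s += "=="; break;
            case 3: s += "="; break;
        }
        return Convert.FromBase64String(s);
    }
}
```

For the bytes 80 9F D5 00, ToHex returns "809FD500" and ToBase64Url returns "gJ_VAA" (standard base64 would be "gJ/VAA=="); both decode back to the original four bytes.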



Answer 3:


It's not working because you are using encoding backwards.

An encoding is used to turn text into bytes, and then back into text again. You can't take arbitrary bytes and turn them into text. Every character has a corresponding byte pattern, but not every byte pattern translates into a character.

If you want a compact way to represent bytes as text, use base-64 encoding:

Int32 i1 = 14000000;
byte[] b = BitConverter.GetBytes(i1);
string s = Convert.ToBase64String(b);

byte[] b2 = Convert.FromBase64String(s);
Int32 i2 = BitConverter.ToInt32(b2, 0);
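Applied to the goal stated in the question's edit (appending the value to another string and writing it with WriteAllText), a sketch could look like this. The class name, the method names and the '|' separator are assumptions made for illustration; '|' is chosen because it never occurs in base64 output, and the sketch assumes it is the last '|' in the file.

```csharp
using System;
using System.IO;

public static class Base64FileDemo
{
    // Hypothetical separator between the leading text and the encoded integer.
    private const char Separator = '|';

    // Appends the base64 form of the integer to the prefix and writes it as text.
    public static void Save(string path, string prefix, int value)
    {
        string encoded = Convert.ToBase64String(BitConverter.GetBytes(value));
        File.WriteAllText(path, prefix + Separator + encoded);
    }

    // Takes everything after the last separator and decodes it back to an integer.
    public static int Load(string path)
    {
        string content = File.ReadAllText(path);
        string encoded = content.Substring(content.LastIndexOf(Separator) + 1);
        return BitConverter.ToInt32(Convert.FromBase64String(encoded), 0);
    }
}
```

Since base64 output consists only of ASCII letters, digits, '+', '/' and '=', the text encoding applied by WriteAllText cannot corrupt it.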



Answer 4:


If your goal here is to store an integer as a string and then convert it back to an integer, then unless I am missing something, wouldn't the following suffice?

Int32 i1 = 14000000;
string s = i1.ToString();
Int32 i2 = Int32.Parse(s);


Source: https://stackoverflow.com/questions/14168025/why-doesnt-byte-to-string-and-back-work-as-expected
