Question
I'm trying to work around a problem with connection string encoding in the Firebird .NET provider, version 5.6.0.0 and later (the current version is 5.8.0.0). A full description of the problem is available elsewhere, but I think I can explain it briefly. My system default encoding is win1251, and my connection string contains a parameter called "DbPath" with the value
"F:\\Рабочая\\БД\\2.14.1\\January_2017\\MYDB.IB"
When I pass this connection string to the Firebird .NET provider, it takes the "DbPath" parameter from the connection string and gets the bytes of its value using Encoding.UTF8. This is how it looks in their code:
protected virtual void SendAttachToBuffer(DatabaseParameterBuffer dpb, string database)
{
    XdrStream.Write(IscCodes.op_attach);
    XdrStream.Write(0);
    if (!string.IsNullOrEmpty(Password))
    {
        dpb.Append(IscCodes.isc_dpb_password, Password);
    }
    // database is DbPath
    XdrStream.WriteBuffer(Encoding.UTF8.GetBytes(database));
    XdrStream.WriteBuffer(dpb.ToArray());
}
As you can see, they don't convert the encoding from win1251 to UTF-8; they just get the bytes using Encoding.UTF8.GetBytes().
Later in their code I see that they build a string from bytes using the current encoding (Encoding.Default):
public string GetString(byte[] buffer, int index, int count)
{
    // _encoding is Encoding.Default == win1251
    return _encoding.GetString(buffer, index, count);
}
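The combination of the two snippets above (UTF-8 bytes read back as win1251) can be reproduced in isolation. This is only a minimal sketch: the class and method names are hypothetical, and it assumes modern .NET (Core / 5+), where code-page encodings have to be registered before "windows-1251" can be looked up. It is not the provider's actual code path.

```csharp
using System;
using System.Text;

public static class MojibakeDemo
{
    // Encodes the text as UTF-8, then decodes those same bytes as windows-1251.
    public static string Garble(string text)
    {
        // Assumption: .NET Core / .NET 5+. On .NET Framework windows-1251 is
        // available out of the box and this registration line can be removed.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding win1251 = Encoding.GetEncoding("windows-1251");

        byte[] utf8Bytes = Encoding.UTF8.GetBytes(text);
        return win1251.GetString(utf8Bytes);
    }

    public static void Main()
    {
        string path = "F:\\Рабочая\\БД\\2.14.1\\January_2017\\MYDB.IB";
        // ASCII parts survive; each Cyrillic letter turns into two wrong characters.
        Console.WriteLine(Garble(path));
    }
}
```

Only the non-ASCII parts of the path are affected, because ASCII characters have the same single-byte representation in both encodings.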
The result of these lines of code is that I get an I/O exception, because the UTF-8 bytes are read back as win1251 and my DbPath becomes
"F:\\Р Р°Р±РѕС‡Р°СЏ\\Р‘Р”\\2.14.1\\January_2017\\MYDB.IB"
So the first thing I tried was to convert my connection string to UTF-8 using this code:
private static string Win1251ToUTF8(string source)
{
    Encoding utf8 = Encoding.GetEncoding("utf-8");
    Encoding win1251 = Encoding.GetEncoding("windows-1251");
    byte[] win1251Bytes = win1251.GetBytes(source);
    byte[] utf8bytes = Encoding.Convert(win1251, utf8, win1251Bytes);
    source = utf8.GetString(utf8bytes);
    return source;
    // Actually I'm not sure that I'm converting the encoding correctly
}
But it had no effect. I've tried many variants with Encoding.Convert, but I haven't found a solution yet. Can someone please tell me what I'm doing wrong and how I can solve the problem? Regards.
Answer 1:
I recommend trying the following code; it may help. Create a new C# Windows Forms application, put a large multiline TextBox named "textBox1" and a button named "button1" on the form, and put this code in the button's click handler:
// ----- The work -------------------------------------------------
string source = "F:\\Рабочая\\БД\\2.14.1\\January_2017\\MYDB.IB";
Encoding utf8 = Encoding.UTF8;
Encoding unicode = Encoding.Unicode;
Encoding win1251 = Encoding.GetEncoding("windows-1251");
byte[] utf8Bytes = utf8.GetBytes(source);
byte[] win1251Bytes = win1251.GetBytes(source);
byte[] utf8ofwinBytes = Encoding.Convert(win1251, utf8, win1251Bytes);
string unicodefromutf8 = utf8.GetString(utf8Bytes);
string unicodefromwin1251 = win1251.GetString(win1251Bytes);
// ----- The show -------------------------------------------------
textBox1.Text = "";
textBox1.Text += "Literal Unicode source" + Environment.NewLine;
textBox1.Text += source + Environment.NewLine + Environment.NewLine;
string s1 = "";
textBox1.Text += "UTF8" + Environment.NewLine;
for (int i = 0; i < utf8Bytes.Length; i++)
{
    s1 += utf8Bytes[i].ToString() + ", ";
}
textBox1.Text += s1 + Environment.NewLine + Environment.NewLine;
s1 = "";
textBox1.Text += "WIN 1251" + Environment.NewLine;
for (int i = 0; i < win1251Bytes.Length; i++)
{
    s1 += win1251Bytes[i].ToString() + ", ";
}
textBox1.Text += s1 + Environment.NewLine + Environment.NewLine;
s1 = "";
textBox1.Text += "UTF8 of WIN 1251" + Environment.NewLine;
for (int i = 0; i < utf8ofwinBytes.Length; i++)
{
    s1 += utf8ofwinBytes[i].ToString() + ", ";
}
textBox1.Text += s1 + Environment.NewLine + Environment.NewLine;
textBox1.Text += "Unicode string of UTF8 bytes" + Environment.NewLine;
textBox1.Text += unicodefromutf8 + Environment.NewLine + Environment.NewLine;
textBox1.Text += "Unicode string of WIN 1251 bytes" + Environment.NewLine;
textBox1.Text += unicodefromwin1251 + Environment.NewLine + Environment.NewLine;
Run it and click the button, and you will see that all the conversion and encoding is done as it should be.
You asked for a way to convert Unicode to UTF-8 to win1251 to UTF-8 to Unicode; here it is.
Your misunderstanding may be in these lines:
source = utf8.GetString(utf8bytes);
return source;
This converts the UTF-8 byte array you created back into a Unicode string. So you return a Unicode string, not the UTF-8 byte sequence of your win1251 string. In effect, you return the same string you passed in.
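To make that concrete, here is a small console sketch that repeats the steps of Win1251ToUTF8 from the question and compares the result with the input. The class name is hypothetical, and the provider registration line assumes modern .NET (Core / 5+); on .NET Framework it can be dropped.

```csharp
using System;
using System.Text;

public static class RoundTripDemo
{
    // Same steps as Win1251ToUTF8 in the question: string -> win1251 bytes
    // -> UTF-8 bytes -> string.
    public static string Win1251ToUtf8(string source)
    {
        // Assumption: .NET Core / .NET 5+, where windows-1251 must be registered.
        Encoding.RegisterProvider(CodePagesEncodingProvider.Instance);
        Encoding utf8 = Encoding.UTF8;
        Encoding win1251 = Encoding.GetEncoding("windows-1251");

        byte[] win1251Bytes = win1251.GetBytes(source);
        byte[] utf8Bytes = Encoding.Convert(win1251, utf8, win1251Bytes);
        return utf8.GetString(utf8Bytes);
    }

    public static void Main()
    {
        string path = "F:\\Рабочая\\БД\\2.14.1\\January_2017\\MYDB.IB";
        string converted = Win1251ToUtf8(path);
        // Every character in the path exists in windows-1251, so nothing is lost
        // and the round trip hands back an equal string.
        Console.WriteLine(converted == path);
    }
}
```

A .NET string is always UTF-16 internally, so any conversion that ends in GetString can only change the result when a character is lost along the way. The encoding only matters at the point where the string is turned into bytes.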
You have to pass the (properly zero-terminated) UTF-8 byte sequence to the .NET provider.
Answer 2:
Use Encoding.Convert to convert between character sets:
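As a rough illustration of what such a byte sequence looks like, the sketch below builds a zero-terminated UTF-8 array from a string. The helper name is hypothetical. Whether the Firebird .NET provider offers any way to accept raw bytes instead of a string is not established in this thread, so this shows only the byte layout, not a working fix.

```csharp
using System;
using System.Text;

public static class ZeroTerminatedDemo
{
    // Hypothetical helper: returns the UTF-8 bytes of the text followed by a single 0 byte.
    public static byte[] ToUtf8ZeroTerminated(string text)
    {
        byte[] body = Encoding.UTF8.GetBytes(text);
        // A new array is zero-initialized, so the extra last element is the terminator.
        byte[] result = new byte[body.Length + 1];
        Buffer.BlockCopy(body, 0, result, 0, body.Length);
        return result;
    }

    public static void Main()
    {
        byte[] bytes = ToUtf8ZeroTerminated("F:\\Рабочая\\БД\\2.14.1\\January_2017\\MYDB.IB");
        // Print the bytes as hex to inspect the sequence.
        Console.WriteLine(BitConverter.ToString(bytes));
    }
}
```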
Encoding utf8 = Encoding.UTF8;
Encoding win = Encoding.GetEncoding("windows-1251");
byte[] winBytes = win.GetBytes(source);
byte[] utfBytes = Encoding.Convert(win, utf8, winBytes);
string result = utf8.GetString(utfBytes);
Source: https://stackoverflow.com/questions/42884025/c-sharp-encode-connection-string-from-win1251-to-utf8-and-back