I have been experimenting with ways to read data from SQL Server as quickly as possible, and I made an interesting discovery: if I read the data into a List<object[]> instead of a List<string[]>, performance more than doubles.
I suspect this is due to not having to call the ToString() method on the fields, but I always thought that using objects had a negative impact on performance.
Is there any reason to not use a list of object arrays instead of string arrays?
EDIT: One thought I just had was the storage size of this data. Will storing the data in object arrays take more room than as strings?
Here is my test code:
    // Requires: using System.Collections.Generic; using System.Data.SqlClient;
    private void executeSqlObject()
    {
        List<object[]> list = new List<object[]>();
        using (SqlConnection cnn = new SqlConnection(_cnnString))
        using (SqlCommand cmd = new SqlCommand("select * from test_table", cnn))
        {
            cnn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                int fieldCount = reader.FieldCount;
                while (reader.Read())
                {
                    // Store each field exactly as the reader returns it (already an object).
                    object[] row = new object[fieldCount];
                    for (int i = 0; i < fieldCount; i++)
                    {
                        row[i] = reader[i];
                    }
                    list.Add(row);
                }
            }
        }
    }
    // Requires: using System.Collections.Generic; using System.Data.SqlClient;
    private void executeSqlString()
    {
        List<string[]> list = new List<string[]>();
        using (SqlConnection cnn = new SqlConnection(_cnnString))
        using (SqlCommand cmd = new SqlCommand("select * from test_table", cnn))
        {
            cnn.Open();
            using (SqlDataReader reader = cmd.ExecuteReader())
            {
                int fieldCount = reader.FieldCount;
                while (reader.Read())
                {
                    // Convert every field to text, whatever its underlying type.
                    string[] row = new string[fieldCount];
                    for (int i = 0; i < fieldCount; i++)
                    {
                        row[i] = reader[i].ToString();
                    }
                    list.Add(row);
                }
            }
        }
    }
    // Requires: using System.Diagnostics;
    private void runTests()
    {
        Stopwatch watch = new Stopwatch();
        // Time ten runs of the object[] version.
        for (int i = 0; i < 10; i++)
        {
            watch.Start();
            executeSqlObject();
            Debug.WriteLine("Object Time: " + watch.ElapsedMilliseconds.ToString());
            watch.Reset();
        }
        // Time ten runs of the string[] version.
        for (int i = 0; i < 10; i++)
        {
            watch.Start();
            executeSqlString();
            Debug.WriteLine("String Time: " + watch.ElapsedMilliseconds.ToString());
            watch.Reset();
        }
    }
And the results:
Object Time: 879
Object Time: 812
Object Time: 825
Object Time: 882
Object Time: 880
Object Time: 905
Object Time: 815
Object Time: 799
Object Time: 823
Object Time: 817
Average: 844
String Time: 1819
String Time: 1790
String Time: 1787
String Time: 1856
String Time: 1795
String Time: 1731
String Time: 1792
String Time: 1799
String Time: 1762
String Time: 1869
Average: 1800
object only adds overhead if you are causing additional boxing, and even then the impact is fairly minimal. In your case, reader[i] always returns object. You already have it as object, no matter whether that is a reference to a string, or an int, etc. Of course calling .ToString() adds overhead; in most cases (int, DateTime, etc.) this involves both formatting code and the allocation of one (or more) extra strings. By changing to string you are changing the data (for the worse, IMO: you can no longer sort dates correctly, for example) and adding overhead. The edge case here is if all the columns are already actually strings, in which case you just add a few virtual method calls (but no extra real work).
For info, if you are after raw performance, I thoroughly recommend looking at the micro-ORMs such as dapper. They are heavily optimised, but avoid the weight of "full" ORMs. For example, in dapper:
var myData = connection.Query<TypedObject>("select * from test_table").ToList();
will, I expect, perform very comparably while giving you strongly typed object data.
Is there any reason to not use a list of object arrays instead of string arrays?
It depends on what you want to do with the retrieved values once they are in the arrays. If you're happy to treat each value as an object, then a list of object arrays is fine. If you want to treat them as strings, then at some point you will have to convert or cast each object to a string, so you will incur the cost somewhere.
As Cory mentioned, if you're reading the value as a string from the SqlDataReader, you should test using the GetString(int) method rather than calling ToString() on the value, and use that as the benchmark.
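As a rough sketch of what that GetString-based loop could look like, the example below reads from an in-memory DataTable through the IDataReader interface, so it runs without a SQL Server connection. The names ReaderSketch, BuildTable and ReadStrings and the sample columns are hypothetical. It assumes every column is a string column: GetString performs no conversion, and SqlDataReader throws InvalidCastException if the column holds another type.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

public static class ReaderSketch
{
    // Builds a small in-memory table; the column names and rows are hypothetical.
    public static DataTable BuildTable()
    {
        var table = new DataTable("test_table");
        table.Columns.Add("name", typeof(string));
        table.Columns.Add("city", typeof(string));
        table.Rows.Add("Ann", "Oslo");
        table.Rows.Add("Bob", DBNull.Value);
        return table;
    }

    // Reads every row with GetString instead of reader[i].ToString().
    // Database NULL is mapped to an empty string, which is also what
    // DBNull.Value.ToString() returns in the original loop.
    public static List<string[]> ReadStrings(IDataReader reader)
    {
        var list = new List<string[]>();
        int fieldCount = reader.FieldCount;
        while (reader.Read())
        {
            var row = new string[fieldCount];
            for (int i = 0; i < fieldCount; i++)
            {
                row[i] = reader.IsDBNull(i) ? string.Empty : reader.GetString(i);
            }
            list.Add(row);
        }
        return list;
    }
}
```

Because ReadStrings only depends on IDataReader, the same method should accept the SqlDataReader from the question as well as BuildTable().CreateDataReader(), which makes it easy to time against the ToString() version on a table that really does contain only string columns.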
Alternatively, rather than use arrays you can read the values into a DataSet which may prove easier to work with afterwards.
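One way to try this is DataTable.Load, which fills a DataTable (the building block of a DataSet) from any IDataReader and keeps each column's type. The sketch below is only an illustration under that assumption: DataTableSketch, BuildSource and LoadAll are hypothetical names, and an in-memory table stands in for the query result so the example runs without a database.

```csharp
using System;
using System.Data;

public static class DataTableSketch
{
    // Hypothetical in-memory source standing in for the query result.
    public static DataTable BuildSource()
    {
        var source = new DataTable("test_table");
        source.Columns.Add("id", typeof(int));
        source.Columns.Add("created", typeof(DateTime));
        source.Rows.Add(1, new DateTime(2011, 11, 6));
        source.Rows.Add(2, new DateTime(2011, 11, 7));
        return source;
    }

    // Copies every row from the reader into a new DataTable.
    // Column names and types are taken from the reader's schema.
    public static DataTable LoadAll(IDataReader reader)
    {
        var table = new DataTable();
        table.Load(reader);
        return table;
    }
}
```

Calling DataTableSketch.LoadAll(DataTableSketch.BuildSource().CreateDataReader()) gives a table whose "created" column is still a DateTime, so sorting and filtering work on real dates. With the code from the question, the SqlDataReader returned by cmd.ExecuteReader() could be passed in the same way; whether that is faster than the array versions would need to be measured.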
At the end of the day, what's best depends a lot on how you want to use the results after retrieving them from the database.
Source: https://stackoverflow.com/questions/8030564/sqldatareader-performance-liststring-or-listobject