Question
What is the best strategy to read millions of records from a table (in SQL Server 2012, BI instance), in a streaming fashion (like SQL Server Management Studio does)?
I need to cache these records locally (C# console application) for further processing.
Update - Sample code that works with SqlDataReader
using System;
using System.Data;
using System.Data.SqlClient;
using System.Threading;
using System.Threading.Tasks;

namespace ReadMillionsOfRows
{
    class Program
    {
        static ManualResetEvent done = new ManualResetEvent(false);

        static void Main(string[] args)
        {
            Process();
            done.WaitOne(); // block until the async read completes
        }

        public static async Task Process()
        {
            string connString = @"Server=;Database=;User Id=;Password=;Asynchronous Processing=True";
            string sql = "Select * from tab_abc";
            using (SqlConnection conn = new SqlConnection(connString))
            {
                await conn.OpenAsync();
                using (SqlCommand comm = new SqlCommand(sql))
                {
                    comm.Connection = conn;
                    comm.CommandType = CommandType.Text;
                    using (SqlDataReader reader = await comm.ExecuteReaderAsync())
                    {
                        while (await reader.ReadAsync())
                        {
                            // process the current row here
                        }
                    }
                }
            }
            done.Set();
        }
    }
}
Answer 1:
Use a SqlDataReader; it is forward-only and fast. It only holds a reference to a record while that record is in the scope of being read.
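Inside the reader loop from the question's sample code, rows are best consumed with the typed getters so no DataTable is materialized. A minimal sketch of the loop body, assuming tab_abc has an int Id column and a nullable nvarchar Name column (both hypothetical names):

```csharp
// Look up ordinals once, outside the loop, instead of per row.
int idOrdinal = reader.GetOrdinal("Id");       // "Id" is an assumed column
int nameOrdinal = reader.GetOrdinal("Name");   // "Name" is an assumed column

while (await reader.ReadAsync())
{
    int id = reader.GetInt32(idOrdinal);
    string name = reader.IsDBNull(nameOrdinal)
        ? null
        : reader.GetString(nameOrdinal);
    // process (id, name) here; the reader keeps no earlier rows in memory
}
```

Only the current row is buffered, which is what keeps memory flat over millions of records.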
Answer 2:
That depends on what your cache looks like. If you're going to store everything in memory, and a DataSet is appropriate as a cache, just read everything into the DataSet.
If not, use the SqlDataReader as suggested above and read the records one by one, storing them in your big cache.
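For the DataSet route, a minimal sketch using SqlDataAdapter.Fill (connection string and table name reused from the question; note this loads every row at once, so it is only viable when the full result set fits in memory):

```csharp
// Fill a DataSet as an in-memory cache. Unlike SqlDataReader, this
// materializes ALL rows before returning.
var cache = new DataSet();
using (var conn = new SqlConnection(connString))
using (var adapter = new SqlDataAdapter("SELECT * FROM tab_abc", conn))
{
    adapter.Fill(cache, "tab_abc"); // adapter opens/closes the connection itself
}

DataTable table = cache.Tables["tab_abc"];
// table.Rows can now be scanned repeatedly without touching the database
```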
Do note, however, that there is already a very effective caching mechanism for large database tables: your database itself. With the proper index configuration, the database can probably outperform your cache.
Answer 3:
You can use Entity Framework and paginate the query with Skip and Take to fetch the rows in batches. If you need in-memory caching for such a large dataset, I would suggest using GC.GetTotalMemory to monitor how much managed memory the cache is consuming as it grows.
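A hedged sketch of that paging pattern (MyDbContext, its Records set, and the Id key are hypothetical; a stable OrderBy is required before Skip/Take for the pages to be deterministic):

```csharp
const int pageSize = 10_000;

using (var db = new MyDbContext())   // hypothetical EF context
{
    for (int page = 0; ; page++)
    {
        var batch = db.Records
                      .AsNoTracking()          // skip change tracking for read-only work
                      .OrderBy(r => r.Id)      // stable ordering makes paging deterministic
                      .Skip(page * pageSize)
                      .Take(pageSize)
                      .ToList();
        if (batch.Count == 0) break;

        // process or cache this batch, optionally checking
        // GC.GetTotalMemory(false) to watch cache growth
    }
}
```

Note that Skip/Take paging re-runs the query per page, so for a single sequential pass over the whole table the SqlDataReader approach above is usually cheaper.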
Source: https://stackoverflow.com/questions/13045481/streaming-read-of-over-10-million-rows-from-a-table-in-sql-server