Libraries for ADO.NET to rapidly bulk insert data into a database from a .csv file?

抹茶落季 2020-12-05 12:01

I'd like to know if you can recommend any advanced ADO.NET libraries for working with databases.

I've discovered that LINQ-to-Entities is great for pulling data out of a database, but it isn't fast at bulk inserts, which is what I need for loading data from a .csv file.

2 Answers
  • 2020-12-05 12:59

    The ADO.NET SqlBulkCopy class enables quick, mass upload of records into a table:

        // s_EmptyUploadTable is assumed to be a cached, empty DataTable whose columns
        // match the schema of the destination table.
        DataTable dt = s_EmptyUploadTable.Copy();
        foreach (var itm in yourList) {
            DataRow row = dt.NewRow();
            row["Field1"] = itm.Field1;
            row["Field2"] = itm.Field2;
            dt.Rows.Add(row);
        }
        using (SqlConnection cn = new SqlConnection(yourConnectionString)) {
            cn.Open();
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn)) {
                bulkCopy.DestinationTableName = "dbo.YourActualSQLServerTableName";
                bulkCopy.WriteToServer(dt);
            }
            // No explicit cn.Close() needed; the using block disposes (and closes) the connection.
        }
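
    If you don't have a cached empty DataTable like s_EmptyUploadTable handy, a minimal sketch of building the schema by hand might look like the following (the column names, types, and destination table name are just the placeholders from the snippet above; the explicit ColumnMappings are only needed when source and destination column names differ):

        // Sketch: build the upload table's schema by hand instead of copying a cached one.
        // Column names and types are placeholders.
        DataTable dt = new DataTable();
        dt.Columns.Add("Field1", typeof(string));
        dt.Columns.Add("Field2", typeof(int));

        // ... fill dt.Rows exactly as in the snippet above, then:
        using (SqlConnection cn = new SqlConnection(yourConnectionString)) {
            cn.Open();
            using (SqlBulkCopy bulkCopy = new SqlBulkCopy(cn)) {
                bulkCopy.DestinationTableName = "dbo.YourActualSQLServerTableName";
                // Optional: explicit mappings, useful when source and destination column names differ.
                bulkCopy.ColumnMappings.Add("Field1", "Field1");
                bulkCopy.ColumnMappings.Add("Field2", "Field2");
                bulkCopy.WriteToServer(dt);
            }
        }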
    
  • 2020-12-05 13:04

    You can use the LINQ Entity Data Reader to write an IEnumerable to a database, using SqlBulkCopy behind the scenes. Because the results of a LINQ query are IEnumerable, this library lets you bulk upload the results of a LINQ query straight into the database.

    Since there are LINQ-to-everything adapters, you can do tricks like using the LINQ to CSV library to pull data out of a .csv file with a LINQ query, and then using the LINQ Entity Data Reader to bulk write that data directly into the database.

    Case study:

    Problem: read a .csv file quickly into a database. The connection to the SQL database is via LINQ-to-Entities from C#.

    Solution 1: Use the LINQ to CSV library, construct a LINQ query to pull out the data you want, then write it in using the standard LINQ-to-Entities calls (ctx.AddObject(), ctx.SaveChanges(), etc.). Time taken: 30 seconds for 20,000 records, because LINQ-to-Entities ends up generating a query for every single record (slow!).
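
    For reference, a minimal sketch of what Solution 1 looks like (the ctx ObjectContext, the "Prices" entity set, and the Price entity are placeholder names, not from the original project), which shows why it degrades to one statement per record:

        // Hypothetical sketch of Solution 1: insert each CSV row through LINQ-to-Entities.
        // "ctx", "Prices", and "Price" are placeholder names.
        foreach (var row in csvChains)
        {
            ctx.AddObject("Prices", new Price { Symbol = row.Symbol, Date = row.Date });
        }
        ctx.SaveChanges(); // Entity Framework issues a separate INSERT per record, hence ~30 seconds for 20,000 rows.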

    Solution 2: Use the LINQ to CSV library, construct a LINQ query to pull out the data you want into an IEnumerable, then use the LINQ Entity Data Reader to bulk write this data directly into the target table. Time taken: 3 seconds for 20,000 records.

    Solution 3: Use a stored procedure with SQL "bulk copy". Time taken: 2 seconds for 20,000 records. However, this solution is quite brittle: it relies on a stored procedure, and SQL bulk copy is simply not compatible with some .csv file formats. This method also requires a staging table between the actual target table and the .csv file, to deal with file formatting issues and to help with normalization.
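
    And a rough sketch of Solution 3, assuming a stored procedure (dbo.ImportPricesFromCsv and its @FilePath parameter are placeholder names) that runs the bulk load into a staging table and then moves the rows into the target table:

        // Hypothetical sketch of Solution 3: hand the bulk load off to a stored procedure.
        // "dbo.ImportPricesFromCsv" and "@FilePath" are placeholders; the procedure is
        // assumed to bulk load the .csv into a staging table and then merge into the target.
        using (SqlConnection cn = new SqlConnection(dbSqlConnectionString))
        {
            cn.Open();
            using (SqlCommand cmd = new SqlCommand("dbo.ImportPricesFromCsv", cn))
            {
                cmd.CommandType = CommandType.StoredProcedure;
                cmd.Parameters.AddWithValue("@FilePath", @"T:\x.csv");
                cmd.ExecuteNonQuery();
            }
        }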

    And, here is the source code for solution #2:

    // Assumed namespaces: System, System.Collections.Generic, System.Data, System.Data.SqlClient,
    // LINQtoCSV, and the namespace containing the EntityDataReader AsDataReader() extension.
    static void WriteCSVtoSQLtable()
    {
      // Step 1: Read .csv file into IEnumerable using LINQ-to-CSV class.
    
      // This section requires "LINQtoCSV" class as described at http://www.codeproject.com/KB/linq/LINQtoCSV.asp
    
      string inputFilePath = @"T:\x.csv";
    
      CsvFileDescription inputFileDescription = new CsvFileDescription
      {
        SeparatorChar = ',',
        FirstLineHasColumnNames = true
      };
    
      CsvContext cc = new CsvContext(); // LINQtoCSV context used to read the file.
      IEnumerable<MyCustomColumnMappingClass> csvChains = cc.Read<MyCustomColumnMappingClass>(inputFilePath, inputFileDescription);
    
      // Step 2: Now write into the target table on SQL Server.
    
      // This section requires "EntityDataReader" class described at http://archive.msdn.microsoft.com/LinqEntityDataReader.
      string dbSqlConnectionString = @"Data Source=(local);Initial Catalog=PhiEngine;Integrated Security=True;MultipleActiveResultSets=True";

      SqlConnection dbSql = new SqlConnection(dbSqlConnectionString);
      dbSql.Open(); // The connection must be open before starting a transaction or bulk copying.

      using (var tran = dbSql.BeginTransaction())
      {
    
        var csvFile = from p in csvChains
                    select p;
    
        SqlBulkCopy bc = new SqlBulkCopy(dbSql,
          SqlBulkCopyOptions.CheckConstraints |
          SqlBulkCopyOptions.FireTriggers |
          SqlBulkCopyOptions.KeepNulls, tran)
                         {
                           BatchSize = 1000,
                           DestinationTableName = "TStagingTable" // Temporary staging table in database.
                         };
    
        bc.WriteToServer(csvFile.AsDataReader()); // Extension method .AsDataReader depends on adding the EntityDataReader class to your C# project (see above).
    
        tran.Commit();
      }
    }
    
    // This class is used by LINQ to CSV to query the .csv file, see "LINQtoCSV" website.
    public class MyCustomColumnMappingClass
    {
      [CsvColumn(Name = "symbol", FieldIndex = 1)]
      public string Symbol { get; set; }
    
      [CsvColumn(Name = "date", FieldIndex = 3, OutputFormat = @"MM/dd/yyyy")]
      public DateTime Date { get; set; }
    }
    