Hi, I am trying to import a large XML file into a table on my SQL Server (2014).
I have used the code below for smaller files and thought it would be ok as this is a o
The max size of an XML column value in SQL Server is 2GB. It will not be possible to import a 2.5GB file into a single XML column.
UPDATE
Since your underlying objective is to transform XML elements within the file into table rows, you don't need to stage the entire file contents into a single XML column. You can avoid the 2GB limitation, reduce memory requirements, and improve performance by shredding the XML in client code and using a bulk insert technique to insert batches of multiple rows.
The example PowerShell script below uses an XmlTextReader to avoid reading the entire XML into a DOM and uses SqlBulkCopy to insert batches of many rows at once. The combination of these techniques should allow you to insert millions of rows in minutes rather than hours. The same techniques can be implemented in a custom app or an SSIS script task.
I noticed a couple of the table columns specify varchar(1), yet the XML attribute values contain many characters. You'll need to either expand the length of the columns or transform the source values.
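To decide how wide each column needs to be, one option is to scan the file once and record the longest value seen for each attribute. The sketch below is a minimal, hypothetical helper (Get-MaxAttributeLength is not part of the original script). It assumes the rows are `file` elements, and it accepts any XmlReader so it can be tried on a small in-memory sample with made-up values before pointing it at the real file.

```powershell
# Hypothetical helper: returns a hashtable of attribute name -> longest value length
# seen on elements named $ElementName. The input is streamed; no DOM is built.
function Get-MaxAttributeLength {
    param(
        [Parameter(Mandatory = $true)] [System.Xml.XmlReader] $Reader,
        [string] $ElementName = "file"
    )
    $maxLengths = @{}
    while ($Reader.Read()) {
        if (($Reader.NodeType -eq [System.Xml.XmlNodeType]::Element) -and ($Reader.Name -eq $ElementName)) {
            # Walk the attributes of the current element.
            while ($Reader.MoveToNextAttribute()) {
                $length = $Reader.Value.Length
                if ((-not $maxLengths.ContainsKey($Reader.Name)) -or ($length -gt $maxLengths[$Reader.Name])) {
                    $maxLengths[$Reader.Name] = $length
                }
            }
            # Return to the owning element before reading on.
            $null = $Reader.MoveToElement()
        }
    }
    return $maxLengths
}

# Small in-memory sample (made-up values) to try the helper without touching the real file.
$sampleXml = '<files><file path="a/b.xml" Quality="LOW" /><file path="a/longer/c.xml" Quality="MEDIUM" /></files>'
$stringReader = New-Object System.IO.StringReader -ArgumentList $sampleXml
$sampleReader = [System.Xml.XmlReader]::Create($stringReader)
$lengths = Get-MaxAttributeLength -Reader $sampleReader
$sampleReader.Close()

# Print one "name: length" line per attribute.
$lengths.GetEnumerator() | Sort-Object Key | ForEach-Object { Write-Host ("{0}: {1}" -f $_.Key, $_.Value) }
```

For the real file, the same function could be called with an XmlTextReader opened on the file path. Note that PowerShell hashtable keys are case-insensitive, so attributes whose names differ only by case would be merged.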
[String]$global:connectionString = "Data Source=YourServer;Initial Catalog=YourDatabase;Integrated Security=SSPI";
[System.Data.DataTable]$global:dt = New-Object System.Data.DataTable;
[System.Xml.XmlTextReader]$global:xmlReader = New-Object System.Xml.XmlTextReader("C:\FilesToImport\files.xml");
[Int32]$global:batchSize = 10000;
# Builds one DataTable row from the attributes of the current <file> element.
Function Add-FileRow() {
    $newRow = $global:dt.NewRow();
    $newRow["Product_ID"] = $global:xmlReader.GetAttribute("Product_ID");
    $newRow["path"] = $global:xmlReader.GetAttribute("path");
    $newRow["Updated"] = $global:xmlReader.GetAttribute("Updated");
    $newRow["Quality"] = $global:xmlReader.GetAttribute("Quality");
    $newRow["Supplier_id"] = $global:xmlReader.GetAttribute("Supplier_id");
    $newRow["Prod_ID"] = $global:xmlReader.GetAttribute("Prod_ID");
    $newRow["Catid"] = $global:xmlReader.GetAttribute("Catid");
    $newRow["On_Market"] = $global:xmlReader.GetAttribute("On_Market");
    $newRow["Model_Name"] = $global:xmlReader.GetAttribute("Model_Name");
    $newRow["Product_View"] = $global:xmlReader.GetAttribute("Product_View");
    $newRow["HighPic"] = $global:xmlReader.GetAttribute("HighPic");
    $newRow["HighPicSize"] = $global:xmlReader.GetAttribute("HighPicSize");
    $newRow["HighPicWidth"] = $global:xmlReader.GetAttribute("HighPicWidth");
    $newRow["HighPicHeight"] = $global:xmlReader.GetAttribute("HighPicHeight");
    $newRow["Date_Added"] = $global:xmlReader.GetAttribute("Date_Added");
    # add the row only after all values have been set
    $null = $global:dt.Rows.Add($newRow);
}
try
{
    # init data table schema (WHERE 0 = 1 returns the column definitions but no rows)
    $da = New-Object System.Data.SqlClient.SqlDataAdapter("SELECT * FROM dbo.files_index WHERE 0 = 1;", $global:connectionString);
    $null = $da.Fill($global:dt);
    $bcp = New-Object System.Data.SqlClient.SqlBulkCopy($global:connectionString);
    $bcp.DestinationTableName = "dbo.files_index";
    $recordCount = 0;
    # stream the file node by node instead of loading it into a DOM
    while($global:xmlReader.Read() -eq $true)
    {
        if(($global:xmlReader.NodeType -eq [System.Xml.XmlNodeType]::Element) -and ($global:xmlReader.Name -eq "file"))
        {
            # Add-FileRow takes no parameters; it reads from the global reader
            Add-FileRow;
            $recordCount += 1;
            # send a full batch to the server, then empty the table for the next batch
            if(($recordCount % $global:batchSize) -eq 0)
            {
                $bcp.WriteToServer($global:dt);
                $global:dt.Rows.Clear();
                Write-Host "$recordCount file elements processed so far";
            }
        }
    }
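    # flush the final partial batch
    if($global:dt.Rows.Count -gt 0)
    {
        $bcp.WriteToServer($global:dt);
    }
    Write-Host "$recordCount file elements imported";
}
catch
{
    throw;
}
finally
{
    # release the connection and the file handle even if the import fails
    if($null -ne $bcp) { $bcp.Close(); }
    $global:xmlReader.Close();
}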
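One detail the script leaves implicit: GetAttribute returns a null reference when an element lacks the requested attribute, and an empty string will not convert to a numeric or date column. A minimal sketch of one way to handle both cases is to map them to DBNull before assigning to the row. ConvertTo-DbValue is a hypothetical helper, the sample XML and the two-column table are made up for illustration, and treating empty strings as NULL is an assumption that may not suit every column.

```powershell
# Hypothetical helper: maps a missing ($null) or empty attribute value to DBNull
# so the row stores NULL instead of failing on type conversion.
function ConvertTo-DbValue {
    param([AllowNull()] [object] $Value)
    if ([string]::IsNullOrEmpty($Value)) {
        return [System.DBNull]::Value
    }
    return $Value
}

# Made-up two-column table standing in for dbo.files_index.
$table = New-Object System.Data.DataTable
$null = $table.Columns.Add("path", [string])
$null = $table.Columns.Add("HighPicSize", [int])

# Sample input: a normal value, an empty value, and a missing attribute.
$xml = '<files><file path="a.xml" HighPicSize="1024" /><file path="b.xml" HighPicSize="" /><file path="c.xml" /></files>'
$stringReader = New-Object System.IO.StringReader -ArgumentList $xml
$reader = [System.Xml.XmlReader]::Create($stringReader)
while ($reader.Read()) {
    if (($reader.NodeType -eq [System.Xml.XmlNodeType]::Element) -and ($reader.Name -eq "file")) {
        $row = $table.NewRow()
        $row["path"] = ConvertTo-DbValue -Value ($reader.GetAttribute("path"))
        $row["HighPicSize"] = ConvertTo-DbValue -Value ($reader.GetAttribute("HighPicSize"))
        $null = $table.Rows.Add($row)
    }
}
$reader.Close()

# Show which rows ended up with NULL in the numeric column.
foreach ($r in $table.Rows) {
    Write-Host ("{0}: HighPicSize is null = {1}" -f $r["path"], $r.IsNull("HighPicSize"))
}
```

In the import script, the same wrapper could be applied to each GetAttribute call inside Add-FileRow, provided the destination columns allow NULL.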