.NET compression of XML to store in SQL Server database

痞子三分冷 提交于 2019-11-28 05:01:43

问题


Currently our .NET application constructs XML data in memory that we persist to a SQL Server database. The XElement object is converted to a string using ToString() and then stored in a varchar(MAX) column in the DB. We dind't want to use the SQL XML datatype as we didn't need any validation and SQL doesn't need to query the XML at any stage.

Although this implementation works fine, we want to reduce the size of the database by compressing the XML before storing it, and decompressing it after retrieving it. Does anyone have any sample code for compressing an XElement object (and decompressing would be great too)? Also, what changes would I need to make to the data type of the database column so that we can fully take advantage of this compression?

I have investigated again the XML datatype SQL Server 2005 offers, and the validation overhead it offers is too high for us to consider using it. Also, although it does compress the XML somewhat, it doesn't as much compression as the .NET DeflateStream class.

I have tested the DeflateStream class by writing the XML we use to disk, and then saving the comrpessed version as a new file. The results are great, a 16kb file goes down to a 3kb file, so it's jsut a case of getting this to work in memory and saving the resulting data to the DB. Does anyone have any sample code to do the compression, and should I change the varcahr(MAX) colum to type to maybe varbinary?

Thanks in advance


回答1:


This article may help you get a start.

The following snippet can compress a string and return a base-64 coded result:

public static string Compress(string text)
{
 byte[] buffer = Encoding.UTF8.GetBytes(text);
 MemoryStream ms = new MemoryStream();
 using (GZipStream zip = new GZipStream(ms, CompressionMode.Compress, true))
 {
  zip.Write(buffer, 0, buffer.Length);
 }

 ms.Position = 0;
 MemoryStream outStream = new MemoryStream();

 byte[] compressed = new byte[ms.Length];
 ms.Read(compressed, 0, compressed.Length);

 byte[] gzBuffer = new byte[compressed.Length + 4];
 System.Buffer.BlockCopy(compressed, 0, gzBuffer, 4, compressed.Length);
 System.Buffer.BlockCopy(BitConverter.GetBytes(buffer.Length), 0, gzBuffer, 0, 4);
 return Convert.ToBase64String (gzBuffer);
}

EDIT: As an aside, you may want to use CLOB formats even when storing XML as text because varchars have a very limited length - which XML can often quickly exceed.




回答2:


I think you should also re-test the XML column. It stores in binary, I know, not as text. It could be smaller, and may not perform badly, even if you don't actually need the additional features.




回答3:


Besides possibly compressing the string itself (perhaps using LBushkin's Base64 method above), you probably want to start with making sure you kill all the whitespace. The default XElement.ToString() method saves the element with "indenting". You need to use the ToString(SaveOptions options) method (using SaveOptions.DisableFormatting) if you want to make sure you've just got the tags and data.




回答4:


I know you tagged the question SQL 2005, but you should consider upgrading to SQL 2008 and using the wonderful new compression capabilities that come with it. Is out-of-the-box, transparent for your application and will save you a huge implementation/test/support cost.



来源:https://stackoverflow.com/questions/1089150/net-compression-of-xml-to-store-in-sql-server-database

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!