Question
I would like to download, unzip, and parse an XML file (the URL is in the code below).
Here is my code:
HttpClientHandler handler = new HttpClientHandler()
{
    CookieContainer = new CookieContainer(),
    UseCookies = true,
    AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate,
    // | DecompressionMethods.None,
};
using (var http = new HttpClient(handler))
{
    var response = http.GetAsync(@"https://login.tradedoubler.com/report/published/aAffiliateEventBreakdownReportWithPLC_806880712_4446152766894956100.xml.zip").Result;
    Stream streamContent = response.Content.ReadAsStreamAsync().Result;
    using (var gZipStream = new GZipStream(streamContent, CompressionMode.Decompress))
    {
        var settings = new XmlReaderSettings()
        {
            DtdProcessing = DtdProcessing.Ignore
        };
        var reader = XmlReader.Create(gZipStream, settings);
        reader.MoveToContent();
        XElement root = XElement.ReadFrom(reader) as XElement;
    }
}
I get an exception on XmlReader.Create(gZipStream, settings):
"The magic number in GZip header is not correct. Make sure you are passing in a GZip stream."
To double check that I am getting properly formatted data from the web, I grab the stream and save it to a file:
byte[] byteContent = response.Content.ReadAsByteArrayAsync().Result;
File.WriteAllBytes(@"C:\temp\1111.zip", byteContent);
When I examine 1111.zip, it appears to be a well-formed zip file containing the XML that I need.
I was advised that I do not need GZipStream at all, but if I remove the compression stream from the code completely and pass streamContent directly to the XML reader, I get an exception:
"Data at the root level is invalid. Line 1, position 1."
With or without the decompression stream, I still fail to parse this file. What am I doing wrong?
Answer 1:
The file in question is encoded in PKZip format, not GZip format.
You'll need a different library to decompress it, such as System.IO.Compression.ZipFile.
You can typically tell the encoding by the file extension. PKZip files often use .zip, while GZip files often use .gz.
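Because the extension is only a hint, the first bytes of the payload are a more reliable check: GZip data starts with 0x1F 0x8B (the "magic number" the exception message refers to), while PKZip archives start with the ASCII letters "PK" (0x50 0x4B). A minimal sketch, assuming the downloaded bytes are already in memory; ArchiveSniffer and DetectFormat are hypothetical names used for illustration, not part of any library:

```csharp
public static class ArchiveSniffer
{
    // Hypothetical helper (not from the original answer): names the container
    // format based on the first two bytes of the payload.
    public static string DetectFormat(byte[] data)
    {
        if (data == null || data.Length < 2)
            return "unknown";

        // GZip streams begin with the bytes 0x1F 0x8B.
        if (data[0] == 0x1F && data[1] == 0x8B)
            return "gzip";

        // PKZip archives begin with the ASCII letters "PK" (0x50 0x4B).
        if (data[0] == 0x50 && data[1] == 0x4B)
            return "zip";

        return "unknown";
    }
}
```

Calling ArchiveSniffer.DetectFormat(byteContent) on the downloaded payload would be expected to return "zip" if the file really is a PKZip archive, in which case GZipStream is the wrong tool for it.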
See: Unzip files programmatically in .net
Answer 2:
After you save the stream to a local folder, unzip it with the ZipFile class. Something like this:
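// Requires: using System.IO; using System.IO.Compression; using System.Xml;
// (on .NET Framework, also reference the System.IO.Compression.FileSystem assembly)
byte[] byteContent = response.Content.ReadAsByteArrayAsync().Result;
string filename = @"C:\temp\1111.zip";
File.WriteAllBytes(filename, byteContent);
string destinationDir = @"c:\temp";
// Assumed name: replace with the actual name of the XML entry inside the archive.
string xmlFilename = "report.xml";
// Throws an IOException if an extracted file already exists in destinationDir.
System.IO.Compression.ZipFile.ExtractToDirectory(filename, destinationDir);
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.Load(Path.Combine(destinationDir, xmlFilename));
// XML reading goes here...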
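If writing temporary files to disk is undesirable, the same idea can be done in memory with ZipArchive from System.IO.Compression. The following is only a sketch under some assumptions: ZippedXml and LoadFirstXmlEntry are hypothetical names, and it assumes the archive contains at least one entry whose name ends in .xml (the first such entry is parsed).

```csharp
using System;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Xml;
using System.Xml.Linq;

public static class ZippedXml
{
    // Hypothetical helper: opens a PKZip archive from a stream and parses
    // the first entry whose name ends in ".xml".
    public static XElement LoadFirstXmlEntry(Stream zipStream)
    {
        // Disposing the archive also disposes zipStream.
        using (var archive = new ZipArchive(zipStream, ZipArchiveMode.Read))
        {
            ZipArchiveEntry entry = archive.Entries.FirstOrDefault(
                e => e.FullName.EndsWith(".xml", StringComparison.OrdinalIgnoreCase));
            if (entry == null)
                throw new InvalidDataException("The archive contains no .xml entry.");

            // Same DTD setting as in the question's code.
            var settings = new XmlReaderSettings { DtdProcessing = DtdProcessing.Ignore };
            using (Stream entryStream = entry.Open())
            using (XmlReader reader = XmlReader.Create(entryStream, settings))
            {
                return XElement.Load(reader);
            }
        }
    }
}
```

With the byteContent array from the snippet above, a call might look like XElement root = ZippedXml.LoadFirstXmlEntry(new MemoryStream(byteContent)); wrapping the bytes in a MemoryStream gives ZipArchive the seekable stream it works with most directly.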
Source: https://stackoverflow.com/questions/41353557/download-and-unzip-xml-file