How to identify doc, docx, pdf, xls and xlsx based on file header in C#? I don't want to rely on the file extensions or on MimeMapping.GetMimeMapping for this as either o
user2173353 has what appears to be the correct solution for detecting the new Office .docx / .xlsx formats. To add some detail, the check below appears to identify these correctly:
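using System.IO;
using System.IO.Packaging;
using System.Linq;

/// <summary>
/// MS .docx, .xlsx and other Office Open XML files are (correctly) identified as zip files
/// by signature lookup. This tests whether System.IO.Packaging is able to open the stream;
/// if the package has parts, it is an Office package rather than a plain zip file.
/// </summary>
/// <param name="stream">The stream to test (already identified as a zip file by its signature).</param>
/// <returns>true if the stream opens as a package containing at least one part; otherwise false.</returns>
// Extension methods must be declared inside a non-generic static class.
private static bool IsPackage(this Stream stream)
{
    // Dispose the package once the parts have been checked.
    using (Package package = Package.Open(stream, FileMode.Open, FileAccess.Read))
    {
        return package.GetParts().Any();
    }
}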
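For completeness, the "signature lookup" step that comes before IsPackage could look something like the sketch below. This is only an illustration, not code from the linked answer: the FileKind enum and the FileSignature class are names made up for this example. The byte values are the well-known leading bytes of PDF ("%PDF"), the OLE2 compound file container used by legacy .doc/.xls, and zip ("PK\x03\x04"). Note that the OLE2 and zip signatures identify only the container, so they cannot tell .doc from .xls or .docx from .xlsx on their own.

```csharp
using System;
using System.IO;
using System.Linq;

// Hypothetical names for illustration only.
public enum FileKind { Unknown, Pdf, OleCompound, Zip }

public static class FileSignature
{
    // Leading bytes ("magic numbers") of each container format.
    private static readonly byte[] Pdf = { 0x25, 0x50, 0x44, 0x46 };                         // "%PDF"
    private static readonly byte[] Ole = { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 }; // legacy .doc/.xls
    private static readonly byte[] Zip = { 0x50, 0x4B, 0x03, 0x04 };                         // "PK\x03\x04"

    // Assumes a seekable stream; the position is reset to 0 afterwards.
    public static FileKind Detect(Stream stream)
    {
        var header = new byte[8];
        stream.Position = 0;

        // Read may return fewer bytes than requested, so loop until full or end of stream.
        int read = 0;
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0) break; // end of stream
            read += n;
        }
        stream.Position = 0;

        if (StartsWith(header, read, Pdf)) return FileKind.Pdf;
        if (StartsWith(header, read, Ole)) return FileKind.OleCompound;
        if (StartsWith(header, read, Zip)) return FileKind.Zip;
        return FileKind.Unknown;
    }

    private static bool StartsWith(byte[] header, int length, byte[] signature)
    {
        return length >= signature.Length
            && header.Take(signature.Length).SequenceEqual(signature);
    }
}
```

When Detect returns FileKind.Zip, the stream can then be passed to IsPackage to separate Office packages from ordinary zip archives. Telling .docx from .xlsx would still require looking inside the package, which this sketch does not attempt.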