How to identify doc, docx, pdf, xls and xlsx based on file header

后端 未结 4 1250
暖寄归人
暖寄归人 2020-12-28 18:28

How to identify doc, docx, pdf, xls and xlsx based on file header in C#? I don\'t want to rely on the file extensions neither MimeMapping.GetMimeMapping for this as either o

4条回答
  •  谎友^
    谎友^ (楼主)
    2020-12-28 19:00

    user2173353 has what appears to be the correct solution for detecting the new Office .docx / .xlsx formats. To add some details to this, the below check appears to identify these correctly:

        /// 
        /// MS .docx, .xslx and other extensions are (correctly) identified as zip files using signature lookup.
        /// This tests if System.IO.Packaging is able to open, and if package has parts, this is not a zip file.
        /// 
        /// 
        /// 
        private static bool IsPackage(this Stream stream)
        {
            Package package = Package.Open(stream, FileMode.Open, FileAccess.Read);
            return package.GetParts().Any();
        }
    

提交回复
热议问题