Determine if a byte[] is a pdf file

筅森魡賤 submitted on 2019-12-17 22:23:26

Question


Is there any way of checking whether a byte[] is a PDF without opening it?

I have some code that displays a list of byte[] as PDF thumbnails. Previously I knew every byte[] was a PDF because we filtered the servlet to return only PDFs. Now the requirement has changed and I need to bring back all file types. Is there any way of checking what a byte[] contains, or more specifically of determining that it isn't a PDF?


Answer 1:


Check the first 4 bytes of the array.

If those are 0x25 0x50 0x44 0x46 then it's most probably a PDF file.
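A minimal Java sketch of this check, assuming the data is already in memory as a byte[]. The class name PdfMagic and the method name startsWithPdfMagic are hypothetical and chosen only for illustration. A match means the array probably holds a PDF; it does not prove that the rest of the content is well-formed.

```java
// Hypothetical helper; the class and method names are illustrative only.
final class PdfMagic {
    // "%PDF" in ASCII: 0x25 0x50 0x44 0x46
    private static final byte[] MAGIC = {0x25, 0x50, 0x44, 0x46};

    // Returns true when the array begins with the four bytes of "%PDF".
    static boolean startsWithPdfMagic(byte[] data) {
        // Null or too-short input cannot carry the signature.
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            if (data[i] != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

When building the thumbnail list, a call such as PdfMagic.startsWithPdfMagic(bytes) could be used to skip entries that are clearly not PDFs.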




Answer 2:


The first four bytes should be 0x25 0x50 0x44 0x46 (in hex; in ASCII that is %PDF). Other file formats have their own "magic numbers" that can be checked the same way.




Answer 3:


As far as I know, all PDFs start with %PDF, so you could check the first bytes against this string.




Answer 4:


While the accepted answer and the other answers are correct, they will not succeed 100% of the time. The %PDF-1.x header is not guaranteed to sit in the first 4 bytes: Adobe's implementation notes for the PDF Reference say that Acrobat viewers only require it to appear somewhere within the first 1024 bytes. Some programs add data before %PDF, and the resulting files are still accepted as valid.
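A more tolerant check can be sketched in Java as a scan of the first 1024 bytes for the header. This is only a sketch under stated assumptions: the class name PdfHeaderScan and its method names are hypothetical, and searching for the five bytes of "%PDF-" (rather than the bare four-byte "%PDF") is a choice made here to reduce false positives, not something the answers above prescribe.

```java
// Hypothetical helper; the class and method names are illustrative only.
final class PdfHeaderScan {
    // "%PDF-" in ASCII (assumption: include the dash that precedes the version).
    private static final byte[] HEADER = {0x25, 0x50, 0x44, 0x46, 0x2D};
    // Only the first 1024 bytes are inspected.
    private static final int SCAN_LIMIT = 1024;

    // Returns the offset of "%PDF-" if it lies entirely within the first
    // 1024 bytes, or -1 if it is not found there.
    static int indexOfPdfHeader(byte[] data) {
        if (data == null) {
            return -1;
        }
        // Last start offset at which the whole header still fits in the window.
        int lastStart = Math.min(data.length, SCAN_LIMIT) - HEADER.length;
        for (int start = 0; start <= lastStart; start++) {
            boolean match = true;
            for (int j = 0; j < HEADER.length; j++) {
                if (data[start + j] != HEADER[j]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return start;
            }
        }
        return -1;
    }

    static boolean looksLikePdf(byte[] data) {
        return indexOfPdfHeader(data) >= 0;
    }
}
```

An offset of 0 corresponds to the strict first-bytes check; a positive offset means some data precedes the header, which a downstream PDF library may or may not tolerate.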

I would recommend seeing the answer for the following Stack Overflow question: How to detect if a file is PDF or TIFF?



Source: https://stackoverflow.com/questions/6186980/determine-if-a-byte-is-a-pdf-file
