Error: Unsupported format, or corrupt file: Expected BOF record

落爺英雄遲暮 submitted on 2019-11-27 15:32:08

The error message relates to the BOF (Beginning of File) record of an XLS file. However, the example shows that you are trying to read an XLSX file.

There are 2 possible reasons for this:

  1. Your version of xlrd is old and doesn't support reading xlsx files.
  2. The XLSX file is encrypted and thus stored in the OLE Compound Document format, rather than a zip format, making it appear to xlrd as an older format XLS file.

Double check that you are in fact using a recent version of xlrd. Opening a new XLSX file with data in just one cell should verify that.
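A quick way to see which xlrd release is installed is to ask the package metadata. This is only a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+); the helper name `installed_version` is made up for illustration. Keep in mind that xlrd 2.0 and later no longer read .xlsx files at all, only .xls.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Prints the xlrd version, or None when xlrd is not installed.
    print("xlrd:", installed_version("xlrd"))
```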

However, I would guess that you are encountering the second condition and that the file is encrypted, since you state above that you are already using xlrd version 0.9.2.

XLSX files are encrypted if you explicitly apply a workbook password but also if you password protect some of the worksheet elements. As such it is possible to have an encrypted XLSX file even if you don't need a password to open it.
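One way to tell the two containers apart without opening the file in Excel is to look at its first bytes. A zip archive (a normal XLSX) starts with `PK\x03\x04`, while an OLE Compound Document (a legacy XLS, or an encrypted XLSX) starts with `D0 CF 11 E0 A1 B1 1A E1`. The sketch below assumes nothing beyond the standard library; the function name `container_type` is hypothetical.

```python
ZIP_MAGIC = b"PK\x03\x04"                        # ordinary, unencrypted XLSX
OLE_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # legacy XLS or encrypted XLSX


def container_type(path):
    """Classify a file by its leading bytes: 'zip', 'ole', or 'unknown'."""
    with open(path, "rb") as fh:
        head = fh.read(8)
    if head.startswith(ZIP_MAGIC):
        return "zip"
    if head.startswith(OLE_MAGIC):
        return "ole"
    return "unknown"
```

A file named `.xlsx` that comes back as `'ole'` is a candidate for the encrypted case; `'unknown'` suggests the file is not an Excel workbook of either kind.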

Update: See @BStew's answer for a third, more probable cause: the file is already open in Excel.

BStew

There is also a third reason: the file is already open in Excel. That generates the same error.
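While a workbook is open, Excel keeps a small hidden owner file next to it whose name starts with `~$`. A script that loops over every `*.xlsx` in a folder can pick that file up by mistake, and it is not a valid workbook. Whether that is what happens in any given case is an assumption; the sketch below simply filters such files out. The function name `real_workbooks` is hypothetical.

```python
from pathlib import Path


def real_workbooks(folder, pattern="*.xlsx"):
    """Return workbook paths in `folder`, skipping Excel's '~$' owner files."""
    return sorted(
        p for p in Path(folder).glob(pattern)
        if not p.name.startswith("~$")
    )
```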

A possible fourth reason: you used read_excel to read a CSV file. (That's what happened to me...)

You can get this error when the XLSX file is actually HTML; open it in a text editor to verify this. When I got this error, I solved it using pandas:

import pandas as pd

# read_html returns a list of DataFrames, one per <table> found in the file
df_list = pd.read_html('filename.xlsx')
df = df_list[0]  # the first table
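The "open it in a text editor" check can also be done from a script by peeking at the start of the file. This is a rough heuristic, not a real HTML parser: it assumes an ASCII-compatible encoding and only looks for a few common markers. The name `looks_like_html` and the 1024-byte probe size are arbitrary choices for illustration.

```python
def looks_like_html(path, probe=1024):
    """Return True if the first `probe` bytes contain common HTML markers."""
    with open(path, "rb") as fh:
        head = fh.read(probe).lower()
    markers = (b"<!doctype html", b"<html", b"<table")
    return any(marker in head for marker in markers)
```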

In my case, the issue was with the shared folder itself.

CASE IN POINT: I have a shared folder on a WIN2012 server where the user drops the .xlsx file and then uses my Python script to load that file into a database table.

Even though the user deleted the old file and put in the file that was to be loaded, the BOF error kept mentioning a byte string containing the user's name, yet that name appeared nowhere in any worksheet of the xlsx file. On top of that, when I copied the .xlsx into a newly created folder and ran the script against that new folder, it worked.

So in the end, I deleted the shared folder and noticed that 5 items were deleted even though only 1 item was visible to me and the user. I think it comes down to my lack of Windows administration skills, but that was the culprit.

I got the same error message. It looked weird to me because the script works for the xlsx files in another folder, and the files are almost the same.

I still don't know why this happened, but in the end I copied all the Excel files to another folder and the script worked. It's an option to try if none of the above suggestions work for you.
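The copy-to-another-folder workaround can be scripted so the load step always reads a fresh copy. This is a minimal sketch using only the standard library; `copy_to_scratch` is a hypothetical helper name, and it does not explain why the original location fails.

```python
import shutil
import tempfile
from pathlib import Path


def copy_to_scratch(src):
    """Copy `src` into a new temporary directory and return the copy's path."""
    scratch = Path(tempfile.mkdtemp(prefix="xlsx_copy_"))
    # copy2 also preserves timestamps; it returns the destination file path.
    return Path(shutil.copy2(src, scratch))
```

The returned path can then be passed to whatever reads the workbook, and the temporary directory removed afterwards.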
