How can I convert an XLSB file to CSV using Python?

二次信任 submitted on 2019-12-03 02:59:31
luk32

The most popular Excel packages for Python, openpyxl and xlrd, do not support the xlsb format (see their bug tracker entries: openpyxl, xlrd).

So I'm afraid there is no native Python way =/. However, since you are using Windows, it should be easy to script the task with external tools.

I would suggest taking a look at Convert XLS to XLSB Programatically?. You mention Python in the title, but the body of the question does not suggest you are tied to it, so you could go the pure C# way.

If you are only really comfortable with Python, one of the answers there suggests a command-line tool with the fancy name Convert-XLSB. You could call it as an external tool from Python with subprocess.

I know this is not a good answer, but I don't think there is a better or easier way as of now.

I ran into the same problem, and pyxlsb does the job for me:

from pyxlsb import open_workbook

with open_workbook('HugeDataFile.xlsb') as wb:
    for sheetname in wb.sheets:
        with wb.get_sheet(sheetname) as sheet:
            for row in sheet.rows():
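                values = [r.v for r in row]  # cell values; may be numbers or None
                # join() only accepts strings, so convert each value (None becomes '')
                csv_line = ','.join('' if v is None else str(v) for v in values)  # or do your thing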
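
A plain ','.join does not quote fields that themselves contain commas or line breaks. As a minimal sketch, the same loop can hand the rows to the standard csv module instead. The helper names write_rows and xlsb_sheet_to_csv are made up for this illustration, and UTF-8 output is an assumption:

```python
import csv


def write_rows(rows, out):
    """Write rows of cell objects (each exposing a .v value) to out as CSV."""
    writer = csv.writer(out)
    count = 0
    for row in rows:
        # The csv module quotes fields when needed and writes None as an empty field.
        writer.writerow([cell.v for cell in row])
        count += 1
    return count


def xlsb_sheet_to_csv(xlsb_path, sheet_name, csv_path):
    """Convert one sheet of an .xlsb workbook to a CSV file (hypothetical helper)."""
    # Imported here so write_rows stays usable without pyxlsb installed.
    from pyxlsb import open_workbook

    with open_workbook(xlsb_path) as wb:
        with wb.get_sheet(sheet_name) as sheet:
            # newline='' lets the csv module control the line endings itself.
            with open(csv_path, 'w', newline='', encoding='utf-8') as out:
                return write_rows(sheet.rows(), out)
```

Calling xlsb_sheet_to_csv once per name in wb.sheets would produce one CSV file per sheet.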

I also looked at the problem and the following worked for me: open the file in Excel via Python, then save it to a different file. It is a bit of a workaround, but I like it more than the other solutions. In the example I use file format 6, which is CSV, but you can also use other formats.

import win32com.client
excel = win32com.client.Dispatch("Excel.Application")
excel.DisplayAlerts = False
excel.Visible=False
doc = excel.Workbooks.Open("C:/users/A295998/Python/@TA1PROG3.xlsb")
doc.SaveAs(Filename="C:\\users\\A295998\\Python\\test5.csv",FileFormat=6)
doc.Close()
excel.Quit()
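
If the conversion fails halfway, the hidden Excel process can be left running. A hedged variant of the same approach wraps the calls in try/finally and passes absolute paths. This is only a sketch: it assumes Windows with Excel and pywin32 installed, and the function names csv_path_for and excel_to_csv are invented for the example:

```python
import os


def csv_path_for(xlsb_path):
    """Absolute path of a CSV file placed next to the workbook."""
    base, _ = os.path.splitext(os.path.abspath(xlsb_path))
    return base + '.csv'


def excel_to_csv(xlsb_path):
    """Open the workbook in Excel via COM and save it as CSV (hypothetical helper)."""
    # Windows only; imported lazily so the path helper works everywhere.
    import win32com.client

    target = csv_path_for(xlsb_path)
    excel = win32com.client.Dispatch('Excel.Application')
    excel.DisplayAlerts = False
    excel.Visible = False
    try:
        doc = excel.Workbooks.Open(os.path.abspath(xlsb_path))
        try:
            doc.SaveAs(Filename=target, FileFormat=6)  # 6 = CSV, as in the answer
        finally:
            doc.Close(SaveChanges=False)
    finally:
        # Quit even when the conversion fails, so no hidden Excel process lingers.
        excel.Quit()
    return target
```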

In my experience, I handled xlsb conversion with the LibreOffice command-line utility.

In Ruby, I simply run a system command that calls LibreOffice to convert the xlsb file to csv:

`libreoffice --headless --convert-to csv your_file.xlsb --outdir /path/csv`

To change the encoding, I call iconv from the command line, again from Ruby:

`iconv -f ISO-8859-1 -t UTF-8 /path/csv/your_file.csv > new_file_csv.csv`
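
Since the question asks about Python, the same two steps might be scripted roughly as follows. This is a sketch, not a tested recipe: it assumes the libreoffice binary is on the PATH (on some installs it is called soffice), the function names are made up, and the iconv step is replaced by a plain Python re-encode:

```python
import subprocess


def build_convert_command(xlsb_path, outdir, binary='libreoffice'):
    """Return the argument list for a headless LibreOffice conversion to CSV."""
    return [binary, '--headless', '--convert-to', 'csv', '--outdir', outdir, xlsb_path]


def convert(xlsb_path, outdir):
    # check=True raises CalledProcessError if LibreOffice exits with an error code.
    subprocess.run(build_convert_command(xlsb_path, outdir), check=True)


def reencode(src_path, dst_path, src_encoding='ISO-8859-1', dst_encoding='UTF-8'):
    """Python stand-in for the iconv step: rewrite a text file in another encoding."""
    # newline='' keeps the original line endings untouched in both directions.
    with open(src_path, 'r', encoding=src_encoding, newline='') as src:
        text = src.read()
    with open(dst_path, 'w', encoding=dst_encoding, newline='') as dst:
        dst.write(text)
```

Passing the command as a list avoids shell quoting problems with file names that contain spaces.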

XLSB is a binary format, and I don't think you'll be able to parse it with current Python tools and packages. If you still want to automate the process with Python, you can do what the others have suggested and script that Windows CLI tool: call the .exe with subprocess and pass it the files you want to convert.

For example, with a script similar to this one you could convert all the .xlsb files that you place in the "xlsb" folder to .csv format:

├── xlsb
│   ├── file1.xlsb
│   ├── file2.xlsb
│   └── file3.xlsb
└── xlsb_to_csv.py


xlsb_to_csv.py

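#!/usr/bin/env python

import os
import subprocess

XLSB_DIR = './xlsb'

# ConvertXLS.EXE and --arguments are placeholders: substitute the real tool and its options.
for name in os.listdir(XLSB_DIR):
    if name.lower().endswith('.xlsb'):
        # listdir() returns bare file names, so prepend the folder.
        subprocess.call(['ConvertXLS.EXE', os.path.join(XLSB_DIR, name), '--arguments'])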

Note: the Windows command is pseudocode. I use a similar approach to batch-convert files on headless Windows servers for testing purposes. You just have to figure out the location of the exe and the exact command line.

Hope it helps... good luck!

I think you can do this using pyuno. This blog entry shows how to convert xls files to csv, and since OpenOffice has supported xlsb files since version 3.2, this code might just work for you. You will have to go through the hassle of setting up the pyuno environment, though.

Mixopteryx

The script you reference seems to use the ActiveX interface to Excel and saves via its Workbook.SaveAs method. According to the MSDN documentation, this method has a TextCodepage argument, which may be helpful.

Side note: you can rewrite the VB script in Python; see this question.
