How to convert xls to xlsx

生来不讨喜 2020-11-27 03:58

I have some *.xls (Excel 2003) files, and I want to convert them to xlsx (Excel 2007).

I use the uno Python package. When I save the documents, I can set the

14 answers
  •  猫巷女王i
    2020-11-27 04:53

    I improved the performance of @Jackypengyu's method.

    • XLSX: write one whole row at a time with Worksheet.append instead of cell by cell (http://openpyxl.readthedocs.io/en/default/api/openpyxl.worksheet.worksheet.html#openpyxl.worksheet.worksheet.Worksheet.append)
    • XLS: read each whole row without its empty tail by opening the workbook with ragged_rows=True and using row_slice (http://xlrd.readthedocs.io/en/latest/api.html#xlrd.sheet.Sheet.row_slice)

    Merged cells will be converted too.
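
The two libraries describe a merged range differently: xlrd reports `(rlo, rhi, clo, chi)` with 0-based indexes and exclusive upper bounds, while openpyxl's `merge_cells` takes 1-based, inclusive row and column numbers. A minimal sketch of that mapping (the helper name `xlrd_range_to_openpyxl` is made up for illustration and is not part of either library):

```python
def xlrd_range_to_openpyxl(crange):
    """Map an xlrd merged-cell tuple to keyword arguments for merge_cells.

    xlrd: (rlo, rhi, clo, chi), 0-based, rhi and chi exclusive.
    openpyxl: 1-based, inclusive start/end rows and columns.
    """
    rlo, rhi, clo, chi = crange
    return {
        "start_row": rlo + 1,      # 0-based -> 1-based
        "end_row": rhi,            # exclusive 0-based equals inclusive 1-based
        "start_column": clo + 1,
        "end_column": chi,
    }


# Rows 0-1 and columns 0-2 in xlrd terms, i.e. A1:C2 in Excel.
print(xlrd_range_to_openpyxl((0, 2, 0, 3)))
```

This is why only the lower bounds get `+ 1` in the solution code.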

    Results

    Converting the same 12 files in the same order:

    Original:

    0:00:01.958159
    0:00:02.115891
    0:00:02.018643
    0:00:02.057803
    0:00:01.267079
    0:00:01.308073
    0:00:01.245989
    0:00:01.289295
    0:00:01.273805
    0:00:01.276003
    0:00:01.293834
    0:00:01.261401
    

    Improved:

    0:00:00.774101
    0:00:00.734749
    0:00:00.741434
    0:00:00.744491
    0:00:00.320796
    0:00:00.279045
    0:00:00.315829
    0:00:00.280769
    0:00:00.316380
    0:00:00.289196
    0:00:00.347819
    0:00:00.284242
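
The post does not say how these timings were taken. The `H:MM:SS.ffffff` format matches how Python prints a `datetime.timedelta`, so one plausible way to collect comparable numbers is sketched below. The helper `timed` and the stand-in workload are assumptions for illustration only:

```python
import datetime
import time


def timed(func, *args, **kwargs):
    """Run func(*args, **kwargs) and return (result, elapsed as a timedelta)."""
    started = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = datetime.timedelta(seconds=time.perf_counter() - started)
    return result, elapsed


# Stand-in workload for illustration; swap in the real conversion call.
result, elapsed = timed(sum, range(1000))
print(elapsed)
```

For a real comparison, call `timed` once per file with each version of the converter, keeping the file order the same.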
    

    Solution

    import datetime

    import xlrd
    from openpyxl import Workbook
    from openpyxl.utils.cell import get_column_letter


    def cvt_xls_to_xlsx(*args, **kw):
        """Open an XLS file and convert it to an openpyxl.workbook.Workbook object

        @param args: args for xlrd.open_workbook
        @param kw: kwargs for xlrd.open_workbook
        @return: openpyxl.workbook.Workbook
        """

        book_xls = xlrd.open_workbook(*args, formatting_info=True, ragged_rows=True, **kw)
        book_xlsx = Workbook()

        def _get_xlrd_cell_value(cell):
            value = cell.value
            if cell.ctype == xlrd.XL_CELL_DATE:
                # Use the workbook's own date mode (1900 or 1904 epoch) instead of assuming 0.
                value = datetime.datetime(*xlrd.xldate_as_tuple(value, book_xls.datemode))

            return value

        for sheet_index, sheet_name in enumerate(book_xls.sheet_names()):
            sheet_xls = book_xls.sheet_by_name(sheet_name)

            # A new Workbook already contains one sheet; reuse it for the first XLS sheet.
            if sheet_index == 0:
                sheet_xlsx = book_xlsx.active
                sheet_xlsx.title = sheet_name
            else:
                sheet_xlsx = book_xlsx.create_sheet(title=sheet_name)

            # Write one row at a time; with ragged_rows=True, row_len() excludes the empty tail.
            for row in range(sheet_xls.nrows):
                sheet_xlsx.append(
                    _get_xlrd_cell_value(cell)
                    for cell in sheet_xls.row_slice(row, end_colx=sheet_xls.row_len(row))
                )

            # Merge after writing the data, so that merging cannot shift the rows append() writes to.
            # xlrd ranges are 0-based with exclusive upper bounds; openpyxl is 1-based and inclusive.
            for rlo, rhi, clo, chi in sheet_xls.merged_cells:
                sheet_xlsx.merge_cells(
                    start_row=rlo + 1, end_row=rhi,
                    start_column=clo + 1, end_column=chi,
                )

            # rowinfo_map and colinfo_map only hold rows/columns that have formatting records,
            # so look entries up with .get() to avoid a KeyError.
            for rowx in range(sheet_xls.nrows):
                rowinfo = sheet_xls.rowinfo_map.get(rowx)
                if rowinfo is not None and rowinfo.hidden:
                    print(sheet_name, rowx)
                    sheet_xlsx.row_dimensions[rowx + 1].hidden = True
            for colx in range(sheet_xls.ncols):
                colinfo = sheet_xls.colinfo_map.get(colx)
                if colinfo is not None and colinfo.hidden:
                    print(sheet_name, colx)
                    sheet_xlsx.column_dimensions[get_column_letter(colx + 1)].hidden = True

        return book_xlsx
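
The function only returns a Workbook object, so the caller still has to save it. Below is a rough sketch of a batch wrapper for a folder of files. The name `convert_folder` is hypothetical, and the converter is passed in as a parameter so the sketch does not depend on any particular implementation:

```python
from pathlib import Path


def convert_folder(folder, convert):
    """Convert every .xls file in folder and save a .xlsx file next to it.

    `convert` is any callable that takes a path string and returns an object
    with a save(path) method, such as cvt_xls_to_xlsx from this answer.
    Returns the list of written paths.
    """
    written = []
    for xls_path in sorted(Path(folder).glob("*.xls")):
        xlsx_path = xls_path.with_suffix(".xlsx")
        workbook = convert(str(xls_path))
        workbook.save(str(xlsx_path))
        written.append(xlsx_path)
    return written
```

Usage would look like `convert_folder("reports", cvt_xls_to_xlsx)`, where `"reports"` is a made-up folder name.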
    
