How to convert xls to xlsx

生来不讨喜 2020-11-27 03:58

I have some *.xls (Excel 2003) files, and I want to convert them to xlsx (Excel 2007).

I use the uno python package, when I save the documents, I can set the

14 answers
  • 2020-11-27 04:33

    I tried @Jhon Anderson's solution. It works well, but I got a "year is out of range" error when there are cells with a time format like HH:mm:ss and no date. Therefore I improved the algorithm again:

    import datetime

    import openpyxl
    import xlrd


    def xls_to_xlsx(*args, **kw):
        """
        Open an XLS file and convert it to an openpyxl.workbook.Workbook.
        ----------
        @param args: args for xlrd.open_workbook
        @param kw: kwargs for xlrd.open_workbook
        @return: openpyxl.workbook.Workbook object
        """
        # formatting_info=True is needed for xlrd to report merged_cells
        book_xls = xlrd.open_workbook(*args, formatting_info=True, ragged_rows=True, **kw)
        book_xlsx = openpyxl.workbook.Workbook()

        def _get_xlrd_cell_value(cell):
            value = cell.value
            if cell.ctype == xlrd.XL_CELL_DATE:
                # use the workbook's own date system instead of hard-coding 0
                datetime_tup = xlrd.xldate_as_tuple(value, book_xls.datemode)
                if datetime_tup[0:3] == (0, 0, 0):   # time format without date
                    value = datetime.time(*datetime_tup[3:])
                else:
                    value = datetime.datetime(*datetime_tup)
            return value

        for sheet_index, sheet_name in enumerate(book_xls.sheet_names()):
            sheet_xls = book_xls.sheet_by_name(sheet_name)
            if sheet_index == 0:
                # reuse the sheet that a new Workbook already contains
                sheet_xlsx = book_xlsx.active
                sheet_xlsx.title = sheet_name
            else:
                sheet_xlsx = book_xlsx.create_sheet(title=sheet_name)

            for row in range(sheet_xls.nrows):
                sheet_xlsx.append((
                    _get_xlrd_cell_value(cell)
                    for cell in sheet_xls.row_slice(row, end_colx=sheet_xls.row_len(row))
                ))

            # Apply merges once the rows are in place. xlrd ranges are 0-based
            # with exclusive upper bounds; openpyxl is 1-based and inclusive.
            for rlo, rhi, clo, chi in sheet_xls.merged_cells:
                sheet_xlsx.merge_cells(start_row=rlo + 1, end_row=rhi,
                                       start_column=clo + 1, end_column=chi)
        return book_xlsx
    

    Then it works perfectly!
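
    For anyone wondering where the (0, 0, 0) date part comes from: Excel stores a date as a serial day count with the time of day as the fractional part, so a time-only cell has a day count of 0 and therefore no valid year. The sketch below illustrates this with the standard library only. It is not how xlrd is implemented; the name excel_serial_to_python is made up for this example, and it assumes the 1900 date system (datemode 0) and ignores the small serials below 61 that Excel's fictitious 1900-02-29 makes ambiguous.

```python
import datetime

# Assumption: 1900 date system (xlrd datemode 0). With this epoch, serial
# numbers from 61 upwards map to the same calendar dates Excel displays.
EXCEL_EPOCH_1900 = datetime.datetime(1899, 12, 30)


def excel_serial_to_python(serial):
    """Convert an Excel serial number to a time (no date part) or a datetime."""
    days = int(serial)
    seconds = round((serial - days) * 86400)
    if days == 0:
        # Time-only cell: there is no date, so build a datetime.time.
        # The modulo guards against a fraction that rounds up to 24:00:00.
        seconds %= 86400
        return datetime.time(seconds // 3600, seconds % 3600 // 60, seconds % 60)
    return EXCEL_EPOCH_1900 + datetime.timedelta(days=days, seconds=seconds)


print(excel_serial_to_python(0.5))        # 12:00:00
print(excel_serial_to_python(44162.25))   # 2020-11-27 06:00:00
```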

  • 2020-11-27 04:34

    I've had to do this before. The main idea is to use the xlrd module to open and parse an xls file, then write the content to an xlsx file using the openpyxl module.

    Here's my code. Attention! It cannot handle complex xls files, so you should add your own parsing logic if you are going to use it.

    import xlrd
    from openpyxl.workbook import Workbook

    def open_xls_as_xlsx(filename):
        # first open using xlrd
        book = xlrd.open_workbook(filename)

        # pick the first sheet that contains data
        # (raises IndexError if every sheet is empty)
        index = 0
        nrows, ncols = 0, 0
        while nrows * ncols == 0:
            sheet = book.sheet_by_index(index)
            nrows = sheet.nrows
            ncols = sheet.ncols
            index += 1

        # prepare a xlsx sheet
        book1 = Workbook()
        sheet1 = book1.active

        # xlrd indices are 0-based, openpyxl rows and columns are 1-based
        for row in range(nrows):
            for col in range(ncols):
                sheet1.cell(row=row + 1, column=col + 1).value = sheet.cell_value(row, col)

        return book1
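
    One detail to watch when copying cell by cell: xlrd counts rows and columns from 0, while openpyxl (like Excel itself) counts from 1. The small standard-library sketch below makes the offset explicit by turning 0-based indices into an Excel-style reference. zero_based_to_a1 is a hypothetical helper written for this illustration, not part of either library.

```python
def zero_based_to_a1(row, col):
    """Map 0-based (row, col) indices, as used by xlrd, to a reference like 'A1'."""
    if row < 0 or col < 0:
        raise ValueError("row and col must be non-negative")
    letters = ""
    n = col + 1              # switch to the 1-based column number
    while n > 0:
        n, rem = divmod(n - 1, 26)
        letters = chr(ord("A") + rem) + letters
    return f"{letters}{row + 1}"   # rows are 1-based as well


print(zero_based_to_a1(0, 0))    # A1
print(zero_based_to_a1(2, 27))   # AB3
```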
    
  • 2020-11-27 04:36

    You need to have win32com (part of the pywin32 package) installed on your machine, along with Microsoft Excel itself. Here is my code:

    import win32com.client as win32

    fname = "full+path+to+xls_file"
    excel = win32.gencache.EnsureDispatch('Excel.Application')
    wb = excel.Workbooks.Open(fname)

    # FileFormat = 51 is for .xlsx extension, FileFormat = 56 is for .xls extension
    wb.SaveAs(fname + "x", FileFormat=51)
    wb.Close()
    excel.Application.Quit()
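
    If there are several files to convert, the same calls can be wrapped in a loop. This is only a sketch under some assumptions: Windows with Excel and pywin32 installed, and the helper names xlsx_target and convert_with_excel are mine. Excel may not share Python's working directory, so the paths are made absolute first, and the win32com import lives inside the function so that the path helper can be used on any platform.

```python
from pathlib import Path

XL_OPEN_XML_WORKBOOK = 51  # FileFormat value for .xlsx, as in the snippet above


def xlsx_target(xls_path):
    """Return the absolute output path: same folder and name, .xlsx suffix."""
    return Path(xls_path).resolve().with_suffix(".xlsx")


def convert_with_excel(xls_paths):
    """Convert each .xls file through Excel automation (Windows only)."""
    import win32com.client as win32  # deferred: needs pywin32 and Excel

    excel = win32.gencache.EnsureDispatch("Excel.Application")
    try:
        for xls_path in xls_paths:
            wb = excel.Workbooks.Open(str(Path(xls_path).resolve()))
            try:
                wb.SaveAs(str(xlsx_target(xls_path)), FileFormat=XL_OPEN_XML_WORKBOOK)
            finally:
                wb.Close(False)  # False: do not ask to save changes again
    finally:
        # Quit even if a conversion fails, so no Excel process is left behind.
        excel.Application.Quit()
```

    It could then be called as convert_with_excel(Path("reports").glob("*.xls")), where "reports" stands for whatever folder holds your files.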
    
  • 2020-11-27 04:39

    I tried @Jhon's solution first, then turned to pyexcel as a solution:

    pyexcel.save_as(file_name=oldfilename, dest_file_name=newfilename)
    

    It worked properly until I tried to package my project into a single exe file with PyInstaller. I tried all the hidden-import options, but the following error was still there:

      File "utils.py", line 27, in __enter__
        pyexcel.save_as(file_name=self.filename, dest_file_name=newfilename)
      File "site-packages\pyexcel\core.py", line 77, in save_as
      File "site-packages\pyexcel\internal\core.py", line 22, in get_sheet_stream
      File "site-packages\pyexcel\plugins\sources\file_input.py", line 39, in get_data
      File "site-packages\pyexcel\plugins\parsers\excel.py", line 19, in parse_file
      File "site-packages\pyexcel\plugins\parsers\excel.py", line 40, in _parse_any
      File "site-packages\pyexcel_io\io.py", line 73, in get_data
      File "site-packages\pyexcel_io\io.py", line 91, in _get_data
      File "site-packages\pyexcel_io\io.py", line 188, in load_data
      File "site-packages\pyexcel_io\plugins.py", line 90, in get_a_plugin
      File "site-packages\lml\plugin.py", line 290, in load_me_now
      File "site-packages\pyexcel_io\plugins.py", line 107, in raise_exception
    pyexcel_io.exceptions.SupportingPluginAvailableButNotInstalled: Please install pyexcel-xls
    [3192] Failed to execute script
    

    Then, I jumped to pandas:

    pd.read_excel(oldfilename).to_excel(newfilename, sheet_name=self.sheetname,index=False)
    

    Update @ 21-Feb 2020

    openpyxl provides the worksheet method append.

    It adds rows to an xlsx file, which means you can read the data from an xls file and insert it into an xlsx file.

    • append(['This is A1', 'This is B1', 'This is C1'])
    • or append({'A': 'This is A1', 'C': 'This is C1'})
    • or append({1: 'This is A1', 3: 'This is C1'})

    Appends a group of values at the bottom of the current sheet:

    • If it’s a list: all values are added in order, starting from the first column
    • If it’s a dict: values are assigned to the columns indicated by the keys (numbers or letters)
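
    A short sketch of these three forms, assuming openpyxl is installed. The workbook only lives in memory here, and the cell texts are adjusted from the list above so that each one names the cell it actually lands in.

```python
from openpyxl import Workbook

wb = Workbook()
ws = wb.active

ws.append(["This is A1", "This is B1", "This is C1"])   # list: columns A, B, C
ws.append({"A": "This is A2", "C": "This is C2"})       # dict keyed by column letter
ws.append({1: "This is A3", 3: "This is C3"})           # dict keyed by column number

# Each call wrote one new row below the previous one; skipped columns stay empty.
for row in ws.iter_rows(values_only=True):
    print(row)
```

    To keep the result, call wb.save() with a file name ending in .xlsx.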
  • 2020-11-27 04:40

    CONVERT XLS FILE TO XLSX

    Using Python 3.6, I have just come across the same issue, and after hours of struggle I solved it by doing the following. You probably won't need all of the packages. (I will be as clear as possible.)

    Make sure to install the following packages before proceeding:

    pip install pyexcel
    pip install pyexcel-xls
    pip install pyexcel-xlsx
    pip install pyexcel-cli

    step 1:

    import pyexcel
    

    step 2: load the source sheet (the file can be e.g. "example.xls", "example.xlsx" or "example.xlsm")

    sheet0 = pyexcel.get_sheet(file_name="your_file_path.xls", name_columns_by_row=0)
    

    step3: create array from contents

    xlsarray = sheet0.to_array()
    

    step4: check variable contents to verify

    xlsarray
    

    step5: pass the array held in the variable xlsarray to a new sheet variable called sheet1

    sheet1 = pyexcel.Sheet(xlsarray)
    

    step6: save the new sheet under a name ending with .xlsx (in my case I want xlsx)

    sheet1.save_as("test.xlsx")
    
  • 2020-11-27 04:40

    Well, I kept it simple and used pandas:

    import pandas as pd

    # raw strings keep backslashes literal, so a single backslash is enough
    df = pd.read_excel(r'Path_of_your_file\name_of_your_file.xls')

    df.to_excel(r'Output_path\new_file_name.xlsx', index=False)
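
    Note that read_excel loads only the first sheet by default. If the workbook has several sheets, something along these lines should carry all of them over. It is a sketch, not a tested drop-in: engine_for and convert_all_sheets are names I made up, and it assumes pandas plus xlrd (for reading .xls) and openpyxl (for writing .xlsx) are installed. As with the two-line version, only cell values survive; formatting and formulas are not copied.

```python
from pathlib import Path

# Engines pandas relies on for each format; naming them explicitly documents
# which extra packages must be installed.
ENGINES = {".xls": "xlrd", ".xlsx": "openpyxl"}


def engine_for(path):
    """Return the pandas engine name for a spreadsheet path, based on its suffix."""
    suffix = Path(path).suffix.lower()
    if suffix not in ENGINES:
        raise ValueError(f"unsupported spreadsheet type: {suffix!r}")
    return ENGINES[suffix]


def convert_all_sheets(src, dst):
    """Copy every sheet of src into dst, keeping sheet names and order."""
    import pandas as pd  # deferred so engine_for works without pandas

    # sheet_name=None returns a dict {sheet name: DataFrame} covering all sheets
    sheets = pd.read_excel(src, sheet_name=None, engine=engine_for(src))
    with pd.ExcelWriter(dst, engine=engine_for(dst)) as writer:
        for name, frame in sheets.items():
            frame.to_excel(writer, sheet_name=name, index=False)
```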
    