Is there a way to auto-adjust Excel column widths with pandas.ExcelWriter?

情到浓时终转凉″ 提交于 2019-11-27 09:19:38

问题


I am being asked to generate some Excel reports. I am currently using pandas quite heavily for my data, so naturally I would like to use the pandas.ExcelWriter method to generate these reports. However the fixed column widths are a problem.

The code I have so far is simple enough. Say I have a dataframe called 'df':

writer = pd.ExcelWriter(excel_file_path)
df.to_excel(writer, sheet_name="Summary")

I was looking over the pandas code, and I don't really see any options to set column widths. Is there a trick out there in the universe to make it such that the columns auto-adjust to the data? Or is there something I can do after the fact to the xlsx file to adjust the column widths?

(I am using the OpenPyXL library, and generating .xlsx files - if that makes any difference.)

Thank you.


回答1:


Inspired by user6178746's answer, I have the following:

# Given a dict of dataframes, for example:
# dfs = {'gadgets': df_gadgets, 'widgets': df_widgets}

writer = pd.ExcelWriter(filename, engine='xlsxwriter')
for sheetname, df in dfs.items():  # loop through `dict` of dataframes
    df.to_excel(writer, sheet_name=sheetname)  # send df to writer
    worksheet = writer.sheets[sheetname]  # pull worksheet object
    for idx, col in enumerate(df):  # loop through all columns
        series = df[col]
        max_len = max((
            series.astype(str).map(len).max(),  # len of largest item
            len(str(series.name))  # len of column name/header
            )) + 1  # adding a little extra space
        worksheet.set_column(idx, idx, max_len)  # set column width
writer.save()



回答2:


I'm posting this because I just ran into the same issue and found that the official documentation for Xlsxwriter and pandas still have this functionality listed as unsupported. I hacked together a solution that solved the issue i was having. I basically just iterate through each column and use worksheet.set_column to set the column width == the max length of the contents of that column.

One important note, however. This solution does not fit the column headers, simply the column values. That should be an easy change though if you need to fit the headers instead. Hope this helps someone :)

import pandas as pd
import sqlalchemy as sa
import urllib


read_server = 'serverName'
read_database = 'databaseName'

read_params = urllib.quote_plus("DRIVER={SQL Server};SERVER="+read_server+";DATABASE="+read_database+";TRUSTED_CONNECTION=Yes")
read_engine = sa.create_engine("mssql+pyodbc:///?odbc_connect=%s" % read_params)

#Output some SQL Server data into a dataframe
my_sql_query = """ SELECT * FROM dbo.my_table """
my_dataframe = pd.read_sql_query(my_sql_query,con=read_engine)

#Set destination directory to save excel.
xlsFilepath = r'H:\my_project' + "\\" + 'my_file_name.xlsx'
writer = pd.ExcelWriter(xlsFilepath, engine='xlsxwriter')

#Write excel to file using pandas to_excel
my_dataframe.to_excel(writer, startrow = 1, sheet_name='Sheet1', index=False)

#Indicate workbook and worksheet for formatting
workbook = writer.book
worksheet = writer.sheets['Sheet1']

#Iterate through each column and set the width == the max length in that column. A padding length of 2 is also added.
for i, col in enumerate(my_dataframe.columns):
    # find length of column i
    column_len = my_dataframe[col].astype(str).str.len().max()
    # Setting the length if the column header is larger
    # than the max column value length
    column_len = max(column_len, len(col)) + 2
    # set the column length
    worksheet.set_column(i, i, column_len)
writer.save()



回答3:


There is probably no automatic way to do it right now, but as you use openpyxl, the following line (adapted from another answer by user Bufke on how to do in manually) allows you to specify a sane value (in character widths):

writer.sheets['Summary'].column_dimensions['A'].width = 15



回答4:


There is a nice package that I started to use recently called StyleFrame.

it gets DataFrame and lets you to style it very easily...

by default the columns width is auto-adjusting.

for example:

from StyleFrame import StyleFrame
import pandas as pd

df = pd.DataFrame({'aaaaaaaaaaa': [1, 2, 3], 
                   'bbbbbbbbb': [1, 1, 1],
                   'ccccccccccc': [2, 3, 4]})
excel_writer = StyleFrame.ExcelWriter('example.xlsx')
sf = StyleFrame(df)
sf.to_excel(excel_writer=excel_writer, row_to_add_filters=0,
            columns_and_rows_to_freeze='B2')
excel_writer.save()

you can also change the columns width:

sf.set_column_width(columns=['aaaaaaaaaaa', 'bbbbbbbbb'],
                    width=35.3)


UPDATE

In version 1.4 best_fit argument was added to StyleFrame.to_excel. See the documentation.




回答5:


By using pandas and xlsxwriter you can do your task, below code will perfectly work in Python 3.x. For more details on working with XlsxWriter with pandas this link might be useful https://xlsxwriter.readthedocs.io/working_with_pandas.html

import pandas as pd
writer = pd.ExcelWriter(excel_file_path, engine='xlsxwriter')
df.to_excel(writer, sheet_name="Summary")
workbook = writer.book
worksheet = writer.sheets["Summary"]
#set the column width as per your requirement
worksheet.set_column('A:A', 25)
writer.save()



回答6:


import re
import openpyxl
..
for col in _ws.columns:
    max_lenght = 0
    print(col[0])
    col_name = re.findall('\w\d', str(col[0]))
    col_name = col_name[0]
    col_name = re.findall('\w', str(col_name))[0]
    print(col_name)
    for cell in col:
        try:
            if len(str(cell.value)) > max_lenght:
                max_lenght = len(cell.value)
        except:
            pass
    adjusted_width = (max_lenght+2)
    _ws.column_dimensions[col_name].width = adjusted_width



回答7:


I found that it was more useful to adjust the column with based on the column header rather than column content.

Using df.columns.values.tolist() I generate a list of the column headers and use the lengths of these headers to determine the width of the columns.

See full code below:

import pandas as pd
import xlsxwriter

writer = pd.ExcelWriter(filename, engine='xlsxwriter')
df.to_excel(writer, index=False, sheet_name=sheetname)

workbook = writer.book # Access the workbook
worksheet= writer.sheets[sheetname] # Access the Worksheet

header_list = df.columns.values.tolist() # Generate list of headers
for i in range(0, len(header_list)):
    worksheet.set_column(i, i, len(header_list[i])) # Set column widths based on len(header)

writer.save() # Save the excel file



回答8:


At work, I am always writing the dataframes to excel files. So instead of writing the same code over and over, I have created a modulus. Now I just import it and use it to write and formate the excel files. There is one downside though, it takes a long time if the dataframe is extra large. So here is the code:

def result_to_excel(output_name, dataframes_list, sheet_names_list, output_dir):
    out_path = os.path.join(output_dir, output_name)
    writerReport = pd.ExcelWriter(out_path, engine='xlsxwriter',
                    datetime_format='yyyymmdd', date_format='yyyymmdd')
    workbook = writerReport.book
    # loop through the list of dataframes to save every dataframe into a new sheet in the excel file
    for index, dataframe in enumerate(dataframes_list):
        sheet_name = sheet_names_list[index]  # choose the sheet name from sheet_names_list
        dataframe.to_excel(writerReport, sheet_name=sheet_name, index=False, startrow=0)
        # Add a header format.
        format = workbook.add_format({
            'bold': True,
            'border': 1,
            'fg_color': '#0000FF',
            'font_color': 'white'})
        # Write the column headers with the defined format.
        worksheet = writerReport.sheets[sheet_name]
        for col_num, col_name in enumerate(dataframe.columns.values):
            worksheet.write(0, col_num, col_name, format)
        worksheet.autofilter(0, 0, 0, len(dataframe.columns) - 1)
        worksheet.freeze_panes(1, 0)
        # loop through the columns in the dataframe to get the width of the column
        for index, col in enumerate(dataframe.columns):
            max_width = max([len(str(s)) for s in dataframe[col].values] + [len(col) + 2])
            # define a max width to not get to wide column
            if max_width > 50:
                max_width = 50
            worksheet.set_column(index, index, max_width)
    writerReport.save()
    writerReport.close()
    return output_dir + output_name




回答9:


Easiest solution is to specify width of column in set_column method.

    for worksheet in writer.sheets.values():
        worksheet.set_column(0,last_column_value, required_width_constant)



回答10:


Combining the other answers and comments and also supporting multi-indices:

def autosize_excel_columns(worksheet, df):
  autosize_excel_columns_df(worksheet, df.index.to_frame())
  autosize_excel_columns_df(worksheet, df, offset=df.index.nlevels)

def autosize_excel_columns_df(worksheet, df, offset=0):
  for idx, col in enumerate(df):
    series = df[col]
    max_len = max((
      series.astype(str).map(len).max(),
      len(str(series.name))
    )) + 1
    worksheet.set_column(idx+offset, idx+offset, max_len)

sheetname=...
df.to_excel(writer, sheet_name=sheetname, freeze_panes=(df.columns.nlevels, df.index.nlevels))
worksheet = writer.sheets[sheetname]
autosize_excel_columns(worksheet, df)
writer.save()


来源:https://stackoverflow.com/questions/17326973/is-there-a-way-to-auto-adjust-excel-column-widths-with-pandas-excelwriter

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!