Export Pandas DataFrame into a PDF file using Python

既然无缘 2020-12-01 04:42

What is an efficient way to generate a PDF from a Pandas DataFrame?

4 answers
  •  既然无缘
    2020-12-01 05:16

    This is a solution with an intermediate HTML file.

    The table is pretty-printed with some minimal CSS.

    The pdf conversion is done with weasyprint. You need to pip install weasyprint.

    # Create a pandas dataframe with demo data:
    import pandas as pd
    demodata_csv = 'https://raw.githubusercontent.com/mwaskom/seaborn-data/master/iris.csv'
    df = pd.read_csv(demodata_csv)
    
    # Pretty print the dataframe as an html table to a file
    intermediate_html = '/tmp/intermediate.html'
    to_html_pretty(df,intermediate_html,'Iris Data')
    # if you do not want pretty printing, just use pandas:
    # df.to_html(intermediate_html)
    
    # Convert the html file to a pdf file using weasyprint
    import weasyprint
    out_pdf= '/tmp/demo.pdf'
    weasyprint.HTML(intermediate_html).write_pdf(out_pdf)
    
    # This is the table pretty printer used above.
    # Define it before running the code that calls it:
    
    def to_html_pretty(df, filename='/tmp/out.html', title=''):
        '''
        Write an entire dataframe to an HTML file
        with nice formatting.
        Thanks to @stackoverflowuser2010 for the
        pretty printer see https://stackoverflow.com/a/47723330/362951
        '''
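        ht = ''
        if title != '':
            ht += '<h2> %s </h2>\n' % title
        ht += df.to_html(classes='wide', escape=False)

        with open(filename, 'w') as f:
            f.write(HTML_TEMPLATE1 + ht + HTML_TEMPLATE2)

    # Page wrapper around the table. The <h2> tag above and the stylesheet
    # here are minimal stand-ins; see the linked answer for the full CSS.
    HTML_TEMPLATE1 = '''<html>
    <head>
    <style>
      table, th, td { border: 1px solid black; border-collapse: collapse; }
      th, td { padding: 5px; }
    </style>
    </head>
    <body>
    '''
    HTML_TEMPLATE2 = '''
    </body>
    </html>
    '''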

    Thanks to @stackoverflowuser2010 for the pretty printer; see their answer at https://stackoverflow.com/a/47723330/362951

    I did not use pdfkit, because I had some problems with it on a headless machine. But weasyprint is great.
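
    If the intermediate file is not needed, the same idea can probably be done in memory. The sketch below is only an illustration, not part of the tested solution above: build_html and dataframe_to_pdf are hypothetical helper names, the page wrapper is a bare stand-in without the CSS, and it assumes weasyprint's HTML(string=...) constructor, which takes the markup as a string instead of a filename.

```python
import html


def build_html(table_html, title=''):
    """Wrap an HTML table fragment in a minimal standalone page."""
    # Escape the title so characters like & or < cannot break the markup.
    heading = '<h2>%s</h2>\n' % html.escape(title) if title else ''
    return (
        '<html>\n<head>\n<meta charset="utf-8">\n</head>\n<body>\n'
        + heading + table_html
        + '\n</body>\n</html>\n'
    )


def dataframe_to_pdf(df, out_pdf, title=''):
    """Render a DataFrame to a PDF file without an intermediate HTML file."""
    # Imported here so build_html stays usable without weasyprint installed.
    import weasyprint

    page = build_html(df.to_html(classes='wide'), title)
    # HTML(string=...) parses markup held in memory instead of reading a file.
    weasyprint.HTML(string=page).write_pdf(out_pdf)
```

    With the df from the example above, the call would be dataframe_to_pdf(df, '/tmp/demo.pdf', 'Iris Data'). Unlike to_html_pretty, this sketch leaves pandas' default HTML escaping of cell values switched on.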
