nbconvert multiindex dataframes to latex

Submitted by 半世苍凉 on 2019-12-11 03:43:29

Question


I'm trying to export a multi-index pandas DataFrame to LaTeX using IPython's nbconvert, but the multi-index rows come out all wrong. I put the following code at the top of the notebook so that DataFrames are converted to LaTeX (I found it somewhere on SO but can't remember where):

import pandas as pd
from sympy import latex
from IPython.display import HTML, Latex, display, Math
pd.set_option('display.notebook_repr_html', True)
def _repr_latex_(self):
    # wrap the tabular produced by to_latex() in a center environment
    return "\\begin{center} %s \\end{center}" % self.to_latex()
pd.DataFrame._repr_latex_ = _repr_latex_  # monkey patch pandas DataFrame

The groupby code is fairly large, but I have also tested it with smaller examples like:

import numpy as np
a = np.array([[1, 3, 4, 5],
             [1, 5, 36, 2],
             [3, 6, 23, 5],
             [2, 2, 1, 6],
             [2, 5, 1, 99]])
df = pd.DataFrame(a, columns=['A','B','C','D'])
df.groupby(by=['A','D']).sum()

The result of this is

    \begin{center} \begin{tabular}{lrr}
\toprule
{} &  B &   C \\
A D  &    &     \\
\midrule
1 2  &  5 &  36 \\
  5  &  3 &   4 \\
2 6  &  2 &   1 \\
  99 &  5 &   1 \\
3 5  &  6 &  23 \\
\bottomrule
\end{tabular}
 \end{center}

This example shows only the first of the problems: the rendered output shows the multi-index levels stacked on top of one another, and I can't find a way of formatting it before output. (I am producing many large tables of this sort, so fixing the formatting in LaTeX itself would be [and is] a pain.) With a couple more index levels it becomes totally unreadable. The second big problem is that IPython renders these tables really nicely with display(), adjusting the column widths to the screen, but in LaTeX the table exceeds the page width and most of it is lost.

I have searched all over for a better formatting solution for nbconvert but couldn't find anything. If you have run into this problem too, or know a solution to either of these two problems, please tell me.

PS: I'm using Python 2.7.7, Anaconda 2.0.1 (64-bit), and the latest versions of pandas (0.14.1) and IPython (2.2.0).


Answer 1:


I think this is a bug in to_latex, and the result of res.T.to_latex() doesn't look right either.

A workaround might be to modify the index:

In [11]: res = df.groupby(by=['A','D']).sum()

In [12]: res.index = res.index.map(lambda x: ' & '.join(map(str, x)))

In [13]: res.index.name = 'A & D'

In [14]: res.columns.values[0] = ' & ' + res.columns[0]

In [15]: print res.to_latex(escape=False)  # the whole point is not to escape the &s
\begin{tabular}{lrr}
\toprule
{} &   & B &   C \\
\midrule
A & D  &       &     \\
1 & 2  &     5 &  36 \\
1 & 5  &     3 &   4 \\
2 & 6  &     2 &   1 \\
2 & 99 &     5 &   1 \\
3 & 5  &     6 &  23 \\
\bottomrule
\end{tabular}
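
For anyone on Python 3, the same idea can be wrapped in a small helper. This is only a sketch: join_index_levels and escape_latex are names made up here, not pandas functions, and the escaping covers just a handful of characters. Because escape=False turns off pandas' own escaping, the helper escapes each level label itself before joining the levels with ' & '.

```python
# Hypothetical helpers (not part of pandas): collapse multi-index keys into
# single labels whose ' & ' separators become LaTeX column breaks.

# Minimal set of LaTeX specials to escape; extend as needed (assumption).
LATEX_SPECIALS = {"&": r"\&", "%": r"\%", "_": r"\_", "#": r"\#", "$": r"\$"}


def escape_latex(text):
    """Escape a few LaTeX special characters in a single label."""
    return "".join(LATEX_SPECIALS.get(ch, ch) for ch in text)


def join_index_levels(index_tuples, sep=" & "):
    """Turn each key tuple, e.g. (1, 2), into one string, e.g. '1 & 2'."""
    return [
        sep.join(escape_latex(str(level)) for level in key)
        for key in index_tuples
    ]


# The (A, D) keys from the groupby example in the question.
labels = join_index_levels([(1, 2), (1, 5), (2, 6), (2, 99), (3, 5)])
print(labels)
```

Iterating over a MultiIndex yields one tuple per row, so res.index = join_index_levels(res.index) followed by res.to_latex(escape=False) should reproduce the workaround above. The header cells still need the manual adjustment from In [13] and In [14].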



Answer 2:


Strange. I tried something similar with .to_html() tonight, only to find that the output displayed the raw HTML rather than rendering it. It looks very similar to your result.

FWIW, I'm using IPython 2.2 on a Mac, with Anaconda modules.
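
On the page-width problem from the question, one LaTeX-side option is to let LaTeX scale the table with \resizebox from the graphicx package. A minimal sketch, assuming graphicx (and booktabs, for \toprule and friends) are loaded by the nbconvert LaTeX template in use; fit_to_text_width is a hypothetical helper name, not part of pandas or nbconvert:

```python
def fit_to_text_width(tabular_latex):
    """Wrap a tabular so LaTeX scales it to the text width.

    Assumes the document preamble loads graphicx, which provides \\resizebox.
    """
    body = tabular_latex.strip()
    lines = [
        "\\begin{center}",
        "\\resizebox{\\textwidth}{!}{%",  # '!' keeps the aspect ratio
        body + "%",                       # '%' avoids a stray space
        "}",
        "\\end{center}",
    ]
    return "\n".join(lines)


# Small hand-written tabular standing in for DataFrame.to_latex() output.
sample = "\\begin{tabular}{lr}\nx & 1 \\\\\n\\end{tabular}\n"
wrapped = fit_to_text_width(sample)
print(wrapped)
```

It could replace the body of the monkey-patched _repr_latex_ from the question, e.g. return fit_to_text_width(self.to_latex()). Note that \resizebox scales the whole table, text included, so very wide tables come out small and narrow ones are enlarged.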



Source: https://stackoverflow.com/questions/25734454/nbconvert-multiindex-dataframes-to-latex
