Question
I'm trying to export a multi-index pandas DataFrame to LaTeX using IPython's nbconvert, but the multi-index rows come out all wrong. I'm using the following code at the beginning of my code to convert to LaTeX properly (I found it somewhere on SO but can't remember where):
import numpy as np
import pandas as pd
from sympy import latex
from IPython.display import HTML, Latex, display, Math

pd.set_option('display.notebook_repr_html', True)

def _repr_latex_(self):
    # Wrap the DataFrame's LaTeX table in a centered block for nbconvert
    return "\\begin{center} %s \\end{center}" % self.to_latex()

pd.DataFrame._repr_latex_ = _repr_latex_  # monkey patch pandas DataFrame
The groupby code is pretty large, but I have also tested it with smaller examples like:
a = np.array([[1, 3, 4, 5],
              [1, 5, 36, 2],
              [3, 6, 23, 5],
              [2, 2, 1, 6],
              [2, 5, 1, 99]])
df = pd.DataFrame(a, columns=['A', 'B', 'C', 'D'])
df.groupby(by=['A', 'D']).sum()
The result of this is
\begin{center} \begin{tabular}{lrr}
\toprule
{} & B & C \\
A D & & \\
\midrule
1 2 & 5 & 36 \\
5 & 3 & 4 \\
2 6 & 2 & 1 \\
99 & 5 & 1 \\
3 5 & 6 & 23 \\
\bottomrule
\end{tabular}
\end{center}
This example shows only the first of the problems: in this output the multi-index levels are stacked on top of one another, and I can't find a way of formatting it before output. (I am producing many large tables of this sort, so fixing the formatting in LaTeX itself would be [and is] a pain.) With a couple more index levels, it gets totally unreadable. The second big problem is that IPython renders these tables really nicely with display(), adjusting the column widths to the screen, but in LaTeX they exceed the page width and most of the table is lost.
I have searched all over for a better formatting solution for nbconvert but couldn't find anything. If you have had this problem too, or know a solution to either of these two problems, please tell me.
P.S.: I'm using Python 2.7.7, Anaconda 2.0.1 (64-bit), and the latest versions of pandas (0.14.1) and IPython (2.2.0).
Answer 1:
I think this is a bug in to_latex, and the result of res.T.to_latex() doesn't look right either.
A workaround might be to modify the index:
In [11]: res = df.groupby(by=['A','D']).sum()
In [12]: res.index = res.index.map(lambda x: ' & '.join(map(str, x)))
In [13]: res.index.name = 'A & D'
In [14]: res.columns.values[0] = ' & ' + res.columns[0]
In [15]: print res.to_latex(escape=False) # the whole point is not to escape the &s
\begin{tabular}{lrr}
\toprule
{} & & B & C \\
\midrule
A & D & & \\
1 & 2 & 5 & 36 \\
1 & 5 & 3 & 4 \\
2 & 6 & 2 & 1 \\
2 & 99 & 5 & 1 \\
3 & 5 & 6 & 23 \\
\bottomrule
\end{tabular}
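The same idea (one LaTeX column per index level) can also be sketched without touching the index in place. The block below is only an illustration under stated assumptions: multiindex_to_tabular and fit_to_text_width are hypothetical helper names, not pandas or nbconvert APIs, the inputs are plain Python sequences, and no escaping of LaTeX special characters is done. The second helper is one possible take on the page-width problem from the question. It relies on \resizebox from the graphicx package, which is assumed to be loaded by the LaTeX template in use.

```python
def multiindex_to_tabular(index_names, column_names, index_tuples, rows):
    """Build a booktabs-style tabular with one column per index level.

    Hypothetical helper: all arguments are plain Python sequences and
    cell values are inserted as-is (no LaTeX escaping).
    """
    # Left-align the index levels and right-align the data columns.
    col_spec = "l" * len(index_names) + "r" * len(column_names)
    lines = ["\\begin{tabular}{%s}" % col_spec, "\\toprule"]
    header = list(index_names) + list(column_names)
    lines.append(" & ".join(str(h) for h in header) + " \\\\")
    lines.append("\\midrule")
    for idx, row in zip(index_tuples, rows):
        # Each index level gets its own cell, followed by the data cells.
        cells = [str(v) for v in idx] + [str(v) for v in row]
        lines.append(" & ".join(cells) + " \\\\")
    lines.append("\\bottomrule")
    lines.append("\\end{tabular}")
    return "\n".join(lines)


def fit_to_text_width(tabular):
    """Center a tabular and scale it to \\textwidth (assumes graphicx)."""
    return "\n".join([
        "\\begin{center}",
        "\\resizebox{\\textwidth}{!}{%",
        tabular,
        "}",
        "\\end{center}",
    ])


# The grouped sums from the example above, written out by hand.
index_tuples = [(1, 2), (1, 5), (2, 6), (2, 99), (3, 5)]
rows = [(5, 36), (3, 4), (2, 1), (5, 1), (6, 23)]
table = multiindex_to_tabular(["A", "D"], ["B", "C"], index_tuples, rows)
wrapped = fit_to_text_width(table)
print(wrapped)
```

With a real grouped frame, list(res.index) and res.values.tolist() should give the index tuples and rows. Calling res.reset_index().to_latex(index=False) is another route that turns the index levels into ordinary columns. Note that \resizebox always scales to the requested width, so narrow tables get enlarged as well.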
Answer 2:
Strange. I tried something similar with .to_html() tonight, only to find that the output displayed the raw HTML rather than rendering it. It looks very similar to your result.
FWIW, I'm using IPython 2.2 on a Mac, with Anaconda modules.
Source: https://stackoverflow.com/questions/25734454/nbconvert-multiindex-dataframes-to-latex