How is a Pandas crosstab different from a Pandas pivot_table?

Posted by 。_饼干妹妹 on 2019-12-18 04:34:33

Question


Both pandas.crosstab and pandas.pivot_table seem to provide exactly the same functionality. Are there any differences?


Answer 1:


The main difference between the two is that pivot_table expects your input data to already be a DataFrame; you pass a DataFrame to pivot_table and specify the index/columns/values by passing the column names as strings. With crosstab, you don't necessarily need to have a DataFrame going in, as you just pass array-like objects for index/columns/values.

Looking at the source code for crosstab, it essentially takes the array-like objects you pass, creates a DataFrame, then calls pivot_table as appropriate.

In general, use pivot_table if you already have a DataFrame, so you don't have the additional overhead of creating the same DataFrame again. If you're starting from array-like objects and are only concerned with the pivoted data, use crosstab. In most cases, I don't think it will really make a difference which function you decide to use.
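As a minimal sketch of that difference, the snippet below builds the same table both ways. The arrays `x`, `y`, `v` and the column names are made up for illustration; they are not from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data held in plain arrays (no DataFrame yet).
x = np.array(["a", "a", "b", "b", "b"])
y = np.array(["u", "v", "u", "u", "v"])
v = np.array([1, 2, 3, 4, 5])

# crosstab takes the array-like objects directly.
ct = pd.crosstab(x, y, values=v, aggfunc="sum", rownames=["x"], colnames=["y"])

# pivot_table needs a DataFrame and refers to its columns by name.
df = pd.DataFrame({"x": x, "y": y, "v": v})
pt = df.pivot_table(index="x", columns="y", values="v", aggfunc="sum")

print(ct)
print(pt)
```

Both calls should print the same 2x2 table of sums; only the way the inputs are supplied differs.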




Answer 2:


The two are the same if you pass aggfunc=len and fill_value=0 to pivot_table:

pd.crosstab(df['Col X'], df['Col Y'])
pd.pivot_table(df, index=['Col X'], columns=['Col Y'], aggfunc=len, fill_value=0)
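A small runnable check of this equivalence might look as follows. The sample frame and its `val` column are assumptions added for illustration; `values="val"` is passed so that pivot_table counts a single column and returns flat column labels that can be compared directly with the crosstab result.

```python
import pandas as pd

# Hypothetical data; "val" exists only so pivot_table has something to count.
df = pd.DataFrame({
    "Col X": ["a", "a", "b", "c"],
    "Col Y": ["u", "v", "u", "u"],
    "val": [10, 20, 30, 40],
})

# Frequency table via crosstab (counts by default).
ct = pd.crosstab(df["Col X"], df["Col Y"])

# Same counts via pivot_table: len counts rows per group, fill_value=0 fills empty cells.
pt = pd.pivot_table(df, index="Col X", columns="Col Y", values="val",
                    aggfunc=len, fill_value=0)

print(ct)
print(pt)
```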

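EDIT: There are more differences:

The default aggregation differs: pivot_table averages the values (mean), while crosstab counts occurrences unless you pass values and aggfunc.

When this answer was written, the margins_name parameter was available only in pivot_table; recent pandas versions also accept it in crosstab.

In pivot_table you can use a Grouper for the index and columns keywords.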
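To illustrate the Grouper point, here is a short sketch that bins timestamps by day inside pivot_table. The frame and the column names `when`, `kind` and `amount` are hypothetical.

```python
import pandas as pd

# Hypothetical event log with timestamps.
df = pd.DataFrame({
    "when": pd.to_datetime([
        "2021-01-01 09:00", "2021-01-01 17:00",
        "2021-01-02 10:00", "2021-01-02 11:00",
    ]),
    "kind": ["a", "b", "a", "a"],
    "amount": [1, 2, 3, 4],
})

# pd.Grouper groups the "when" column into daily bins for the row index.
table = pd.pivot_table(
    df,
    index=pd.Grouper(key="when", freq="D"),
    columns="kind",
    values="amount",
    aggfunc="sum",
    fill_value=0,
)

print(table)
```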


I think if you simply need a frequency table, crosstab is the better choice.




Answer 3:


Unfortunately, pivot_table does not have a normalize argument.

In crosstab, the normalize argument converts counts to proportions by dividing each cell by a sum of cells, as described below:

  • normalize = 'index' divides each cell by the sum of its row
  • normalize = 'columns' divides each cell by the sum of its column
  • normalize = True divides each cell by the total of all cells in the table
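The three modes can be sketched on a tiny made-up dataset (the arrays `x` and `y` are assumptions for illustration). The last lines reproduce the row-wise case by hand, which is one way to get the same effect from a table built without normalize.

```python
import numpy as np
import pandas as pd

# Hypothetical categorical data.
x = np.array(["a", "a", "b", "b", "b"])
y = np.array(["u", "v", "u", "u", "v"])

counts = pd.crosstab(x, y, rownames=["x"], colnames=["y"])

row_norm = pd.crosstab(x, y, rownames=["x"], colnames=["y"], normalize="index")
col_norm = pd.crosstab(x, y, rownames=["x"], colnames=["y"], normalize="columns")
all_norm = pd.crosstab(x, y, rownames=["x"], colnames=["y"], normalize=True)

# Manual equivalent of normalize="index": divide each row by its own sum.
manual_row_norm = counts.div(counts.sum(axis=1), axis=0)

print(row_norm)
print(col_norm)
print(all_norm)
```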


Source: https://stackoverflow.com/questions/36267745/how-is-a-pandas-crosstab-different-from-a-pandas-pivot-table
