Is there a pythonic way to do a contingency table in Pandas?

傲寒 2020-12-30 00:09

Given a dataframe that looks like this:

            A   B      
2005-09-06  5  -2  
2005-09-07 -1   3  
2005-09-08  4   5 
2005-09-09 -8   2
2005-09-10 -2  -         


        
4 answers
  •  -上瘾入骨i
    2020-12-30 00:46

    The easiest way is probably the pandas function crosstab. Borrowing the sample data from Dyno Fu's answer:

    import pandas as pd
    from io import StringIO  # Python 3 location; the Python 2 StringIO module is gone

    table = """dt          A   B
    2005-09-06  5  -2
    2005-09-07 -1   3
    2005-09-08  4   5
    2005-09-09 -8   2
    2005-09-10 -2  -5
    2005-09-11 -7   9
    2005-09-12  2   8
    2005-09-13  6  -5
    2005-09-14  6  -5
    """
    # Parse the whitespace-separated text and use the date column as the index.
    df = pd.read_csv(StringIO(table), sep=r"\s+", parse_dates=["dt"])
    df = df.set_index("dt")

    # Cross-tabulate the sign of A against the sign of B.
    pd.crosstab(df.A > 0, df.B > 0)
    

    Output:

    B      False  True 
    A                  
    False      1      3
    True       3      2
    
    [2 rows x 2 columns]
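
    As a sketch of how the table can be made easier to read, the rownames and colnames arguments of crosstab label the two axes, and normalize="all" turns the counts into fractions of the total. The DataFrame below re-enters the same A and B values by hand (the dates are dropped for brevity), and the names counts and shares are only illustrative:

```python
import pandas as pd

# Same A and B values as the sample table, typed in directly (dates omitted).
df = pd.DataFrame({
    "A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
    "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5],
})

# Label the axes so the printed table says what each boolean means.
counts = pd.crosstab(
    df["A"] > 0,
    df["B"] > 0,
    rownames=["A > 0"],
    colnames=["B > 0"],
)

# normalize="all" divides every cell by the total number of rows.
shares = pd.crosstab(
    df["A"] > 0,
    df["B"] > 0,
    rownames=["A > 0"],
    colnames=["B > 0"],
    normalize="all",
)

print(counts)
print(shares.round(3))
```

    Here counts holds the same four numbers as the output above, and the cells of shares sum to 1.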
    

    The resulting table can also be passed straight to scipy.stats, for example to run a Fisher exact test:

    from scipy.stats import fisher_exact
    tab = pd.crosstab(df.A > 0, df.B > 0)
    fisher_exact(tab)
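
    A minimal sketch of reading the test result, assuming a SciPy version in which the return value of fisher_exact unpacks into an odds ratio and a p-value. The variable names are illustrative, and the data is re-entered by hand so the snippet runs on its own:

```python
import pandas as pd
from scipy.stats import fisher_exact

# Same A and B values as the sample table, typed in directly (dates omitted).
df = pd.DataFrame({
    "A": [5, -1, 4, -8, -2, -7, 2, 6, 6],
    "B": [-2, 3, 5, 2, -5, 9, 8, -5, -5],
})

# 2x2 contingency table of sign(A) against sign(B).
tab = pd.crosstab(df["A"] > 0, df["B"] > 0)

# Pass the raw 2x2 array; the default alternative is two-sided.
odds_ratio, p_value = fisher_exact(tab.to_numpy())

print(f"odds ratio = {odds_ratio:.3f}, p-value = {p_value:.3f}")
```

    For this table the sample odds ratio is (1*2)/(3*3) = 2/9, about 0.22.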
    
