How to calculate the conditional probability of values in a pandas DataFrame (Python)?

误落风尘 2020-12-13 16:32

I want to calculate the conditional probabilities of the ratings ('A', 'B', 'C') in the rating column.

    company     model    rating   type
0   ford       mustang          


        
4 Answers
  •  孤街浪徒
    2020-12-13 17:20

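    You can use groupby. Dividing the per-rating counts by the total number of rows gives the marginal P(rating); dividing the per-(rating, type) counts by the per-rating counts gives the conditional P(type | rating):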

    In [1]: import pandas as pd

    In [2]: df = pd.DataFrame({'company': ['ford', 'chevy', 'ford', 'ford', 'ford', 'toyota'],
       ...:                    'model': ['mustang', 'camaro', 'fiesta', 'focus', 'taurus', 'camry'],
       ...:                    'rating': ['A', 'B', 'C', 'A', 'B', 'B'],
       ...:                    'type': ['coupe', 'coupe', 'sedan', 'sedan', 'sedan', 'sedan']})
    
    In [3]: df.groupby('rating').count()['model'] / len(df)
    Out[3]:
    rating
    A    0.333333
    B    0.500000
    C    0.166667
    Name: model, dtype: float64
    
    In [4]: (df.groupby(['rating', 'type']).count() / df.groupby('rating').count())['model']
    Out[4]:
    rating  type
    A       coupe    0.500000
            sedan    0.500000
    B       coupe    0.333333
            sedan    0.666667
    C       sedan    1.000000
    Name: model, dtype: float64
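
As a possible alternative to the groupby division, the same two quantities can probably be obtained with value_counts and pd.crosstab. This is only a sketch: it reuses the example frame from this answer, the variable names p_rating and p_type_given_rating are made up for illustration, and it assumes the goal is P(type | rating).

```python
import pandas as pd

# Same example data as in the answer above.
df = pd.DataFrame({
    'company': ['ford', 'chevy', 'ford', 'ford', 'ford', 'toyota'],
    'model': ['mustang', 'camaro', 'fiesta', 'focus', 'taurus', 'camry'],
    'rating': ['A', 'B', 'C', 'A', 'B', 'B'],
    'type': ['coupe', 'coupe', 'sedan', 'sedan', 'sedan', 'sedan'],
})

# Marginal P(rating): relative frequency of each rating (hypothetical name).
p_rating = df['rating'].value_counts(normalize=True).sort_index()

# Conditional P(type | rating): normalize='index' divides each row
# (one rating) by its row total, so every row sums to 1.
p_type_given_rating = pd.crosstab(df['rating'], df['type'], normalize='index')

print(p_rating)
print(p_type_given_rating)
```

One difference from the groupby version: crosstab returns a full table, so a combination that never occurs (rating C with type coupe here) shows up as 0 instead of being left out.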
    
