Consider a Series with the following percentiles:
> df['col_1'].describe(percentiles=np.linspace(0, 1, 20))
count 13859.000000
mean 421.77
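import pandas as pd
df2 = pd.DataFrame(range(1000))
df2.columns = ['a1']
# Split a1 into 100 equal-sized quantile bins; labels=False returns the bin index (0-99)
df2['percentile'] = pd.qcut(df2.a1, 100, labels=False)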
Or leave out the labels argument to see the range (interval) that each value falls into, instead of the integer bin number.
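To make the difference between the two forms concrete, here is a minimal sketch that builds both columns side by side. The column name interval is made up for illustration, and the data is just the integers 0 to 999, so treat it as a demonstration rather than a recipe for your own df.

```python
import pandas as pd

# Illustrative data only: the integers 0..999 in a column named 'a1'
df2 = pd.DataFrame({'a1': list(range(1000))})

# labels=False: each row gets the integer index of its quantile bin (0..99)
df2['percentile'] = pd.qcut(df2['a1'], 100, labels=False)

# Default labels: each row gets the interval (range) of its bin instead
# ('interval' is a hypothetical column name chosen for this example)
df2['interval'] = pd.qcut(df2['a1'], 100)

# Inspect the first rows and how many rows landed in each bin
print(df2.head())
print(df2['percentile'].value_counts().sort_index().head())
```

With evenly spread data like this, every bin ends up with the same number of rows; on real data with many repeated values, qcut may raise an error about duplicate bin edges, in which case the duplicates argument is worth a look.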
Note that in Python 3, with Pandas 0.16.2 (latest version as of today), you need to use list(range(1000)) instead of range(1000) for the above to work.