Piggybacking off my own previous question, "python pandas: assign control vs. treatment groupings randomly based on %".
Thanks to @maxU, I know how to assign random co
In [13]: df
Out[13]:
  customer_id  Group
0         ABC      1
1         CDE      3
2         BHF      2
3         NID      1
4         WKL      3
5         SDI      2
6         JSK      1
7         OSM      3
8         MPA      2
9         MAD      1
In [14]: d = {1:[.5,.5], 2:[.4,.6], 3:[.2,.8]}
In [15]: df['Flag'] = \
...: df.groupby('Group')['customer_id'] \
...: .transform(lambda x: np.random.choice(['Control','Test'], len(x), p=d[x.name]))
...:
In [16]: df
Out[16]:
  customer_id  Group     Flag
0         ABC      1  Control
1         CDE      3     Test
2         BHF      2     Test
3         NID      1  Control
4         WKL      3  Control
5         SDI      2     Test
6         JSK      1     Test
7         OSM      3     Test
8         MPA      2  Control
9         MAD      1     Test
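
For anyone who wants to reproduce the session above outside IPython, here is a minimal self-contained sketch of the same per-group assignment. The helper name `assign_flags` and the seeded `rng` are my own additions for illustration, not part of @maxU's answer; the data and the `d` mapping are copied from the session. It uses an explicit loop over groups instead of `transform`, which should behave the same way but is easier to step through.

```python
import numpy as np
import pandas as pd

# Same data as in the session above.
df = pd.DataFrame({
    "customer_id": ["ABC", "CDE", "BHF", "NID", "WKL",
                    "SDI", "JSK", "OSM", "MPA", "MAD"],
    "Group": [1, 3, 2, 1, 3, 2, 1, 3, 2, 1],
})

# Per-group probabilities, in the order [Control, Test].
d = {1: [.5, .5], 2: [.4, .6], 3: [.2, .8]}

# Seeded generator so the run is repeatable (an assumption for this
# sketch; the original session used the unseeded np.random.choice).
rng = np.random.default_rng(0)


def assign_flags(frame, probs, rng):
    """Hypothetical helper: draw Control/Test per row using the
    probabilities of that row's Group."""
    flags = pd.Series(index=frame.index, dtype=object)
    for group, idx in frame.groupby("Group").groups.items():
        flags.loc[idx] = rng.choice(["Control", "Test"],
                                    size=len(idx), p=probs[group])
    return flags


df["Flag"] = assign_flags(df, d, rng)
print(df)
```

Note that each row is drawn independently, so the realized Control/Test split inside a group only approaches the target percentages as the group gets larger; with three or four customers per group it can be far off.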