pandas-groupby

Group a dataframe by values that are just less than a second apart - pandas

Submitted by £可爱£侵袭症+ on 2021-01-28 18:30:47
Question: Let's say I have a pandas DataFrame as below:

    >>> df = pd.DataFrame({'dt': pd.to_datetime(['2018-12-10 16:35:34.246',
    ...                                          '2018-12-10 16:36:34.243',
    ...                                          '2018-12-10 16:38:34.216',
    ...                                          '2018-12-10 16:42:34.123']),
    ...                    'value': [1, 2, 3, 4]})
    >>> df
                           dt  value
    0 2018-12-10 16:35:34.246      1
    1 2018-12-10 16:36:34.243      2
    2 2018-12-10 16:38:34.216      3
    3 2018-12-10 16:42:34.123      4

I would like to group this DataFrame by the 'dt' column, but in a way that treats values that are less than a second apart as […]
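The question text is cut off above, but a common reading is that rows whose timestamps are less than a second apart should fall into the same group. A minimal sketch of the usual diff-and-cumsum clustering (the one-second threshold and the sum aggregation are assumptions, since the original intent is truncated):

    import pandas as pd

    df = pd.DataFrame({'dt': pd.to_datetime(['2018-12-10 16:35:34.246',
                                             '2018-12-10 16:36:34.243',
                                             '2018-12-10 16:38:34.216',
                                             '2018-12-10 16:42:34.123']),
                       'value': [1, 2, 3, 4]})

    df = df.sort_values('dt')
    # Start a new group whenever the gap to the previous row is one second
    # or more; consecutive rows closer together share a group label.
    group_id = (df['dt'].diff() >= pd.Timedelta(seconds=1)).cumsum()
    print(df.groupby(group_id)['value'].sum())

Because the labels come from row-to-row differences, this only clusters rows that are adjacent after sorting by 'dt'.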

Pandas pivot table subtotals with multi-index

Submitted by 我的梦境 on 2021-01-28 16:50:32
Question: I'm trying to create a simple pivot table with subtotals, Excel-style, but I can't find a method for it in pandas. I've tried the solution Wes suggested in another subtotal-related question, but that doesn't give the expected results. Below are the steps to reproduce it. Create the sample data:

    sample_data = {'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
                   'product': ['astro', 'ball', 'car', 'astro', 'ball', 'car',
                               'astro', 'ball', 'car', 'astro', 'ball', 'car'],
                   'week': [1, […]
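The sample data is truncated above, so the sketch below completes it with assumed week and quantity values (hypothetical, for illustration only). One way to get Excel-style subtotals is to append a per-customer total row under each customer's block of the pivot:

    import pandas as pd

    # Hypothetical completion of the truncated sample: week and qty are assumed.
    df = pd.DataFrame({
        'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
        'product': ['astro', 'ball', 'car'] * 4,
        'week': [1] * 6 + [2] * 6,
        'qty': [10, 15, 20, 5, 20, 25, 10, 10, 15, 5, 20, 30],
    })

    pivot = df.pivot_table(values='qty', index=['customer', 'product'],
                           columns='week', aggfunc='sum')

    # Append a TOTAL row after each customer's products, Excel-style.
    pieces = []
    for cust, grp in pivot.groupby(level='customer'):
        total = grp.sum().to_frame().T
        total.index = pd.MultiIndex.from_tuples([(cust, 'TOTAL')],
                                                names=pivot.index.names)
        pieces.append(pd.concat([grp, total]))
    result = pd.concat(pieces)

Building the result group by group keeps each TOTAL row directly under its customer, without relying on how a later sort would order the 'TOTAL' label.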

How to shift entire groups in pandas groupby

Submitted by 时光怂恿深爱的人放手 on 2021-01-28 10:32:30
Question: Given the following data:

    data = {'a': [1, 1, 1, 8, 8, 3, 3, 3, 3, 4, 4]}
    df = pd.DataFrame(data)

I would now like to shift the whole thing down by n groups, so that the current order of the groups is preserved. The desired output for a shift of n=1 would be:

    desired_output = {'a': [NaN, NaN, NaN, 1, 1, 8, 8, 8, 8, 3, 3]}
    desired_output_df = pd.DataFrame(desired_output)

A shift of n=2 should give:

    desired_output = {'a': [NaN, NaN, NaN, NaN, NaN, 1, 1, 1, 1, 8, 8]}
    desired_output_df = pd.DataFrame(desired_output)

I have been […]
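One way to read the desired output: every row keeps its position, but takes the value of the run of equal values sitting n runs earlier. A sketch of that approach (variable names are mine): label the consecutive runs, shift the run-to-value mapping by n, then map the labels back:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 1, 1, 8, 8, 3, 3, 3, 3, 4, 4]})
    n = 1

    # Label each run of consecutive equal values: 1,1,1 -> 1; 8,8 -> 2; ...
    run_id = (df['a'] != df['a'].shift()).cumsum()
    # One representative value per run, shifted down by n runs.
    run_values = df.groupby(run_id)['a'].first().shift(n)
    # Each row takes the value of the run n positions before its own.
    df['shifted'] = run_id.map(run_values)

For n=1 this produces [NaN, NaN, NaN, 1, 1, 8, 8, 8, 8, 3, 3], matching the desired output: each run keeps its own length and inherits the value of the previous run.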

Normalize a column of dataframe using min max normalization based on groupby of another column

Submitted by 非 Y 不嫁゛ on 2021-01-28 08:09:44
Question: The dataframe is as shown:

    Name   Job        Salary
    john   painter     40000
    peter  engineer    50000
    sam    plumber     30000
    john   doctor     500000
    john   driver      20000
    sam    carpenter   10000
    peter  scientist  100000

How can I group by the column Name and apply min-max normalization to the Salary column within each group? Expected result:

    Name   Job        Salary
    john   painter    0.041666
    peter  engineer   0
    sam    plumber    1
    john   doctor     1
    john   driver     0
    sam    carpenter  0
    peter  scientist  1

I have tried the following:

    data = df.groupby('Name').transform(lambda x: […]
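Per-group min-max scaling can be done with groupby().transform(), selecting the Salary column first so the other columns are left alone; this reproduces the expected result above (e.g. john's 40000 becomes (40000 - 20000) / (500000 - 20000) ≈ 0.041666):

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['john', 'peter', 'sam', 'john', 'john', 'sam', 'peter'],
        'Job': ['painter', 'engineer', 'plumber', 'doctor', 'driver',
                'carpenter', 'scientist'],
        'Salary': [40000, 50000, 30000, 500000, 20000, 10000, 100000],
    })

    # Scale each Name group to [0, 1]: (x - min) / (max - min).
    df['Salary'] = df.groupby('Name')['Salary'].transform(
        lambda x: (x - x.min()) / (x.max() - x.min()))

Note that a group with a single row (or all-equal salaries) has a zero denominator and comes out as NaN.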

How to keep major-order when copying or groupby-ing a pandas DataFrame?

Submitted by 寵の児 on 2021-01-28 07:02:56
Question: How can I use or manipulate (monkey-patch) pandas so that copy and groupby aggregations always keep the same major-order on the resulting object? I use pandas.DataFrame as the data structure within a business application (a risk model) and need fast aggregation of multidimensional data. Aggregation with pandas depends crucially on the major-ordering scheme of the underlying numpy array. Unfortunately, pandas (version 0.23.4) changes the major-order of the underlying numpy array when I […]
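pandas makes no public guarantee about the major-order of the backing array, so rather than monkey-patching, one workaround is to re-impose the desired order after each operation. A sketch under that assumption (the helper name is mine; whether the rebuilt frame keeps the passed array without copying depends on the pandas version and on the frame having a single dtype):

    import numpy as np
    import pandas as pd

    def rebuild_fortran(df):
        # Rebuild the frame on a column-major (Fortran-ordered) copy of its
        # values; .values may itself copy, so this trades memory for layout.
        return pd.DataFrame(np.asfortranarray(df.values),
                            index=df.index, columns=df.columns)

    df = pd.DataFrame(np.asfortranarray(np.random.rand(6, 3)),
                      columns=list('abc'))
    agg = df.groupby(df['a'] > 0.5).transform('mean')  # order may change here
    agg = rebuild_fortran(agg)
    print(agg.values.flags)  # inspect C_CONTIGUOUS / F_CONTIGUOUS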

Efficient way of aggregating previous(in time) rows

Submitted by ╄→尐↘猪︶ㄣ on 2021-01-28 06:02:39
Question: I have the following dataframe of orders placed by different customers at different times:

    rng = (list(pd.date_range('2019-02-24', periods=5, freq='T'))
           + list(pd.date_range('2019-03-13', periods=2, freq='T'))
           + list(pd.date_range('2019-02-27', periods=1, freq='T')))
    customers = ["12987"] * 5 + ["89563"] * 2 + ["56733"]
    articles = ["8473", "7631", "1264", "8473", "5641", "9813", "7631", "1132"]
    order_history = pd.DataFrame({'Customer_no': customers,
                                  'Date': rng,
                                  'Article_no': articles})
    order […]
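The question is cut off, but "aggregating previous (in time) rows" per customer usually means an expanding or cumulative statistic computed in date order. A minimal sketch (the particular aggregate, a running count of each customer's earlier orders, is an assumption):

    import pandas as pd

    rng = (list(pd.date_range('2019-02-24', periods=5, freq='T'))
           + list(pd.date_range('2019-03-13', periods=2, freq='T'))
           + list(pd.date_range('2019-02-27', periods=1, freq='T')))
    customers = ["12987"] * 5 + ["89563"] * 2 + ["56733"]
    articles = ["8473", "7631", "1264", "8473", "5641", "9813", "7631", "1132"]
    order_history = pd.DataFrame({'Customer_no': customers,
                                  'Date': rng,
                                  'Article_no': articles})

    # Sort by time so "previous" rows really are earlier orders, then count
    # how many orders each customer placed before the current row.
    order_history = order_history.sort_values('Date')
    order_history['prev_orders'] = order_history.groupby('Customer_no').cumcount()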
