pandas-groupby

Group a dataframe by values that are just less than a second apart - pandas

Submitted by £可爱£侵袭症+ on 2021-01-28 18:30:47
Question: Let's say I have a pandas DataFrame as below:

    >>> df = pd.DataFrame({'dt': pd.to_datetime(['2018-12-10 16:35:34.246',
    ...                                          '2018-12-10 16:36:34.243',
    ...                                          '2018-12-10 16:38:34.216',
    ...                                          '2018-12-10 16:42:34.123']),
    ...                    'value': [1, 2, 3, 4]})
    >>> df
                           dt  value
    0 2018-12-10 16:35:34.246      1
    1 2018-12-10 16:36:34.243      2
    2 2018-12-10 16:38:34.216      3
    3 2018-12-10 16:42:34.123      4

I would like to group this DataFrame by the 'dt' column, but in a way that treats values that are less than a second apart as […]
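The question text is cut off above, but a common reading is that rows whose timestamps are less than a second apart should fall into the same group. A minimal sketch of the usual diff-and-cumsum clustering (the one-second threshold and the sum aggregation are assumptions, since the original intent is truncated):

    import pandas as pd

    df = pd.DataFrame({'dt': pd.to_datetime(['2018-12-10 16:35:34.246',
                                             '2018-12-10 16:36:34.243',
                                             '2018-12-10 16:38:34.216',
                                             '2018-12-10 16:42:34.123']),
                       'value': [1, 2, 3, 4]})

    df = df.sort_values('dt')
    # Start a new group whenever the gap to the previous row is one second
    # or more; consecutive rows closer together share a group label.
    group_id = (df['dt'].diff() >= pd.Timedelta(seconds=1)).cumsum()
    print(df.groupby(group_id)['value'].sum())

Because the labels come from row-to-row differences, this only clusters rows that are adjacent after sorting by 'dt'.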

Pandas pivot table subtotals with multi-index

Submitted by 我的梦境 on 2021-01-28 16:50:32
Question: I'm trying to create a simple pivot table with subtotals, Excel-style, but I can't find a method for it in pandas. I've tried the solution Wes suggested in another subtotal-related question, but that doesn't give the expected results. Below are the steps to reproduce it. Create the sample data:

    sample_data = {'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
                   'product': ['astro', 'ball', 'car', 'astro', 'ball', 'car',
                               'astro', 'ball', 'car', 'astro', 'ball', 'car'],
                   'week': [1, […]
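The sample data is truncated above, so the sketch below completes it with assumed week and quantity values (hypothetical, for illustration only). One way to get Excel-style subtotals is to append a per-customer total row under each customer's block of the pivot:

    import pandas as pd

    # Hypothetical completion of the truncated sample: week and qty are assumed.
    df = pd.DataFrame({
        'customer': ['A', 'A', 'A', 'B', 'B', 'B', 'A', 'A', 'A', 'B', 'B', 'B'],
        'product': ['astro', 'ball', 'car'] * 4,
        'week': [1] * 6 + [2] * 6,
        'qty': [10, 15, 20, 5, 20, 25, 10, 10, 15, 5, 20, 30],
    })

    pivot = df.pivot_table(values='qty', index=['customer', 'product'],
                           columns='week', aggfunc='sum')

    # Append a TOTAL row after each customer's products, Excel-style.
    pieces = []
    for cust, grp in pivot.groupby(level='customer'):
        total = grp.sum().to_frame().T
        total.index = pd.MultiIndex.from_tuples([(cust, 'TOTAL')],
                                                names=pivot.index.names)
        pieces.append(pd.concat([grp, total]))
    result = pd.concat(pieces)

Building the result group by group keeps each TOTAL row directly under its customer, without relying on how a later sort would order the 'TOTAL' label.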

How to shift entire groups in pandas groupby

Submitted by 时光怂恿深爱的人放手 on 2021-01-28 10:32:30
Question: Given the following data:

    data = {'a': [1, 1, 1, 8, 8, 3, 3, 3, 3, 4, 4]}
    df = pd.DataFrame(data)

I would now like to shift the whole thing down by n groups, so that the current order of the groups is preserved. The desired output for a shift of n=1 would be:

    desired_output = {'a': [NaN, NaN, NaN, 1, 1, 8, 8, 8, 8, 3, 3]}
    desired_output_df = pd.DataFrame(desired_output)

A shift of n=2 should give:

    desired_output = {'a': [NaN, NaN, NaN, NaN, NaN, 1, 1, 1, 1, 8, 8]}
    desired_output_df = pd.DataFrame(desired_output)

I have been […]
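One way to read the desired output: every row keeps its position, but takes the value of the run of equal values sitting n runs earlier. A sketch of that approach (variable names are mine): label the consecutive runs, shift the run-to-value mapping by n, then map the labels back:

    import pandas as pd

    df = pd.DataFrame({'a': [1, 1, 1, 8, 8, 3, 3, 3, 3, 4, 4]})
    n = 1

    # Label each run of consecutive equal values: 1,1,1 -> 1; 8,8 -> 2; ...
    run_id = (df['a'] != df['a'].shift()).cumsum()
    # One representative value per run, shifted down by n runs.
    run_values = df.groupby(run_id)['a'].first().shift(n)
    # Each row takes the value of the run n positions before its own.
    df['shifted'] = run_id.map(run_values)

For n=1 this produces [NaN, NaN, NaN, 1, 1, 8, 8, 8, 8, 3, 3], matching the desired output: each run keeps its own length and inherits the value of the previous run.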

Normalize a column of dataframe using min max normalization based on groupby of another column

Submitted by 非 Y 不嫁゛ on 2021-01-28 08:09:44
Question: The dataframe is as shown:

    Name   Job        Salary
    john   painter     40000
    peter  engineer    50000
    sam    plumber     30000
    john   doctor     500000
    john   driver      20000
    sam    carpenter   10000
    peter  scientist  100000

How can I group by the column Name and apply min-max normalization to the Salary column within each group? Expected result:

    Name   Job        Salary
    john   painter    0.041666
    peter  engineer   0
    sam    plumber    1
    john   doctor     1
    john   driver     0
    sam    carpenter  0
    peter  scientist  1

I have tried the following:

    data = df.groupby('Name').transform(lambda x: […]
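Per-group min-max scaling can be done with groupby().transform(), selecting the Salary column first so the other columns are left alone; this reproduces the expected result above (e.g. john's 40000 becomes (40000 - 20000) / (500000 - 20000) ≈ 0.041666):

    import pandas as pd

    df = pd.DataFrame({
        'Name': ['john', 'peter', 'sam', 'john', 'john', 'sam', 'peter'],
        'Job': ['painter', 'engineer', 'plumber', 'doctor', 'driver',
                'carpenter', 'scientist'],
        'Salary': [40000, 50000, 30000, 500000, 20000, 10000, 100000],
    })

    # Scale each Name group to [0, 1]: (x - min) / (max - min).
    df['Salary'] = df.groupby('Name')['Salary'].transform(
        lambda x: (x - x.min()) / (x.max() - x.min()))

Note that a group with a single row (or all-equal salaries) has a zero denominator and comes out as NaN.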

How to keep major-order when copying or groupby-ing a pandas DataFrame?

Submitted by 寵の児 on 2021-01-28 07:02:56
Question: How can I use or manipulate (monkey-patch) pandas so that copy and groupby aggregations always keep the same major-order on the resulting object? I use pandas.DataFrame as the data structure within a business application (a risk model) and need fast aggregation of multidimensional data. Aggregation with pandas depends crucially on the major-ordering scheme of the underlying numpy array. Unfortunately, pandas (version 0.23.4) changes the major-order of the underlying numpy array when I […]
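pandas makes no public guarantee about the major-order of the backing array, so rather than monkey-patching, one workaround is to re-impose the desired order after each operation. A sketch under that assumption (the helper name is mine; whether the rebuilt frame keeps the passed array without copying depends on the pandas version and on the frame having a single dtype):

    import numpy as np
    import pandas as pd

    def rebuild_fortran(df):
        # Rebuild the frame on a column-major (Fortran-ordered) copy of its
        # values; .values may itself copy, so this trades memory for layout.
        return pd.DataFrame(np.asfortranarray(df.values),
                            index=df.index, columns=df.columns)

    df = pd.DataFrame(np.asfortranarray(np.random.rand(6, 3)),
                      columns=list('abc'))
    agg = df.groupby(df['a'] > 0.5).transform('mean')  # order may change here
    agg = rebuild_fortran(agg)
    print(agg.values.flags)  # inspect C_CONTIGUOUS / F_CONTIGUOUS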

Efficient way of aggregating previous(in time) rows

Submitted by ╄→尐↘猪︶ㄣ on 2021-01-28 06:02:39
Question: I have the following dataframe of orders placed by different customers at different times:

    rng = (list(pd.date_range('2019-02-24', periods=5, freq='T'))
           + list(pd.date_range('2019-03-13', periods=2, freq='T'))
           + list(pd.date_range('2019-02-27', periods=1, freq='T')))
    customers = ["12987"] * 5 + ["89563"] * 2 + ["56733"]
    articles = ["8473", "7631", "1264", "8473", "5641", "9813", "7631", "1132"]
    order_history = pd.DataFrame({'Customer_no': customers,
                                  'Date': rng,
                                  'Article_no': articles})
    order […]
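The question is cut off, but "aggregating previous (in time) rows" per customer usually means an expanding or cumulative statistic computed in date order. A minimal sketch (the particular aggregate, a running count of each customer's earlier orders, is an assumption):

    import pandas as pd

    rng = (list(pd.date_range('2019-02-24', periods=5, freq='T'))
           + list(pd.date_range('2019-03-13', periods=2, freq='T'))
           + list(pd.date_range('2019-02-27', periods=1, freq='T')))
    customers = ["12987"] * 5 + ["89563"] * 2 + ["56733"]
    articles = ["8473", "7631", "1264", "8473", "5641", "9813", "7631", "1132"]
    order_history = pd.DataFrame({'Customer_no': customers,
                                  'Date': rng,
                                  'Article_no': articles})

    # Sort by time so "previous" rows really are earlier orders, then count
    # how many orders each customer placed before the current row.
    order_history = order_history.sort_values('Date')
    order_history['prev_orders'] = order_history.groupby('Customer_no').cumcount()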
