pandas-groupby | 易学教程

python: cumulative concatenate in pandas dataframe

Question: How can I do a cumulative concatenation in a pandas DataFrame? I found a number of solutions in R, but can't find one in Python. Here is the problem. Suppose we have a DataFrame with the columns date and name:

    import pandas as pd
    d = {'date': [1,1,2,2,3,3,3,4,4,4], 'name':['A','B','A','C','A','B','B','A','B','C']}
    df = pd.DataFrame(data=d)

I want to get CUM_CONCAT, which is a cumulative concatenation grouped by date:

       date name CUM_CONCAT
    0     1    A        [A]
    1     1    B      [A,B]
    2     2    A        [A]
    3     2    C      [A,C]
    4     3    A        [A]
    5     3
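
One possible way to build such a column is sketched below. It is not taken from the original thread: it assumes that the cumulative value is a Python list, that repeated names within a date are kept, and that the helper name `running_lists` is purely illustrative.

```python
from itertools import accumulate

import pandas as pd

d = {'date': [1, 1, 2, 2, 3, 3, 3, 4, 4, 4],
     'name': ['A', 'B', 'A', 'C', 'A', 'B', 'B', 'A', 'B', 'C']}
df = pd.DataFrame(data=d)


def running_lists(names: pd.Series) -> pd.Series:
    """Return the running list of values seen so far (hypothetical helper)."""
    # accumulate() yields ['A'], ['A', 'B'], ... and creates a new list each step.
    lists = list(accumulate(([n] for n in names), lambda acc, x: acc + x))
    return pd.Series(lists, index=names.index)


# Build the running lists one date group at a time, then align them
# back to the original rows through the shared index.
parts = [running_lists(names) for _, names in df.groupby('date')['name']]
df['CUM_CONCAT'] = pd.concat(parts)

print(df)
```

Looping over the groups explicitly keeps the sketch easy to follow; whether duplicates such as the second `B` on date 3 should be kept is an assumption, since the excerpt is cut off before that row.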

What is the difference between bins when using groupby apply vs resample apply?

Question: This is a somewhat broad topic, but I will try to pare it down to some specific questions. I have noticed a difference between resample and groupby that I am curious to learn about. Here is some hourly time series data:

    In[]:
    import pandas as pd
    dr = pd.date_range('01-01-2020 8:00', periods=10, freq='H')
    df = pd.DataFrame({'A':range(10), 'B':range(10,20), 'C':range(20,30)}, index=dr)
    df

    Out[]:
                          A   B   C
    2020-01-01 08:00:00   0  10  20
    2020-01-01 09:00:00   1  11  21
    2020-01-01 10:00:00   2  12  22
    2020-01-01
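
The excerpt is cut off before the specific questions, so the following is only a sketch of one way to inspect the bins each approach produces. The 3-hour bin width is an assumption chosen for illustration, and the lowercase `'h'` alias is used in place of the `'H'` spelling from the question.

```python
import pandas as pd

dr = pd.date_range('2020-01-01 08:00', periods=10, freq='h')
df = pd.DataFrame({'A': range(10), 'B': range(10, 20), 'C': range(20, 30)},
                  index=dr)

# Assumed bin width of 3 hours (not stated in the original excerpt).
by_resample = df.resample('3h').sum()
by_grouper = df.groupby(pd.Grouper(freq='3h')).sum()

# Each label is the left edge of its bin. With the default settings the
# bins are anchored at midnight of the first day, so the 08:00 row falls
# into the bin labelled 06:00 rather than starting a bin of its own.
print(by_resample)
print(by_grouper)
```

With these defaults both calls bin the rows the same way, so any difference the asker saw would have to come from what is passed to `apply` or from non-default arguments, which the truncated excerpt does not show.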

multiple merge operations on two dataframes using pandas

Question: I have two dataframes on which multiple operations need to be performed, for example:

    old_DF
    id col1 col2 col3
    -------------------------
    1 aaa
    2 bbb 123

    new_DF
    id col1 col2 col3
    -------------------------
    1 xxx 999
    2 xxx kkk

The following operations need to be performed on these dataframes:

- Merge the two dataframes.
- Replace only the blank (NA) cells in old_DF with the corresponding values from new_DF.
- Cells from both dataframes whose values contradict each other should be reported in a
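
A minimal sketch of the first operations follows, under loud assumptions: the positions of the blank cells were lost in the excerpt, so the sample values are hypothetical, as are the lowercase names `old_df` and `new_df` and the layout of the conflict report (the original sentence is truncated). It uses `DataFrame.combine_first` to fill only the missing cells.

```python
import numpy as np
import pandas as pd

ids = pd.Index([1, 2], name='id')

# Hypothetical data: which cells are blank is an assumption.
old_df = pd.DataFrame({'col1': ['aaa', 'bbb'],
                       'col2': [np.nan, '123'],
                       'col3': [np.nan, np.nan]}, index=ids, dtype=object)
new_df = pd.DataFrame({'col1': ['xxx', 'xxx'],
                       'col2': ['999', np.nan],
                       'col3': [np.nan, 'kkk']}, index=ids, dtype=object)

# Keep every value already present in old_df; fill only its NA cells.
merged = old_df.combine_first(new_df)

# A cell is a conflict when both frames hold a value and the values differ.
conflict_mask = old_df.notna() & new_df.notna() & (old_df != new_df)
conflicts = pd.DataFrame(
    [(i, c, old_df.at[i, c], new_df.at[i, c])
     for i in old_df.index for c in old_df.columns
     if conflict_mask.at[i, c]],
    columns=['id', 'column', 'old', 'new'],
)

print(merged)
print(conflicts)
```

Indexing both frames by `id` lets pandas align rows by key instead of by position, which matters if the two frames list ids in a different order.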

Aggregate DataFrame based on list values

Question: I have the following problem. I have a list of string values:

    a = ['word1', 'word2', 'word3', 'word4', ..., 'wordN']

And I have a dataframe with these values:

    +----------+-------------+----------+
    | keywords | impressions | clicks   |
    +----------+-------------+----------+
    | word1    | 1245523     | 12321231 |
    +----------+-------------+----------+
    | word2    | 4212321     | 12312312 |
    +----------+-------------+----------+
    ........................................

Please advise me on how to create a specific,

Calculate percentage on DataFrame

Question: I'm trying to calculate the percentage of each crime in the following DataFrame:

          Violent   Murder  Larceny_Theft  Vehicle_Theft
    Year
    1960   288460  3095700        1855400         328200
    1961   289390  3198600        1913000         336000
    1962   301510  3450700        2089600         366800
    1963   316970  3792500        2297800         408300
    1964   364220  4200400        2514400         472800

So I should first calculate the total of crimes per year and then use that to calculate the percentage of each crime. I was trying the following:

    perc = (crime * 100) / crime.sum(axis=1)

Any
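
The attempted expression most likely fails because dividing a DataFrame by a Series with `/` aligns the Series index (the years) against the DataFrame's column labels, which produces only NaN values. A minimal sketch of one fix, assuming `Year` is the index of `crime`, is to divide row-wise with `DataFrame.div(..., axis=0)`:

```python
import pandas as pd

crime = pd.DataFrame(
    {'Violent': [288460, 289390, 301510, 316970, 364220],
     'Murder': [3095700, 3198600, 3450700, 3792500, 4200400],
     'Larceny_Theft': [1855400, 1913000, 2089600, 2297800, 2514400],
     'Vehicle_Theft': [328200, 336000, 366800, 408300, 472800]},
    index=pd.Index([1960, 1961, 1962, 1963, 1964], name='Year'),
)

# Total number of crimes per year (one value per row).
totals = crime.sum(axis=1)

# axis=0 matches the totals to rows by their Year label, so each row
# is divided by its own total.
perc = crime.div(totals, axis=0) * 100

print(perc.round(2))
```

Each row of `perc` then sums to 100, which is a quick sanity check for the result.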
