group-by

how to perform drop_duplicates with multiple condition in a pandas dataframe

纵然是瞬间 submitted on 2020-01-05 08:24:38
Question: I have a df,

        Sr.No  Name  Class  Data
    0   1      Sri   1      sri is a good player
    1   ''     Sri   2      sri is good in cricket
    2   ''     Sri   3      sri went out
    3   2      Ram   1      Ram is a good player
    4   ''     Ram   2      sri is good in cricket
    5   ''     Ram   3      Ram went out
    6   3      Sri   1      sri is a good player
    7   ''     Sri   2      sri is good in cricket
    8   ''     Sri   3      sri went out
    9   4      Sri   1      sri is a good player
    10  ''     Sri   2      sri is good in cricket
    11  ''     Sri   3      sri went out
    12  ''     Sri   4      sri came back

I am trying to drop duplicates based on ["Name","Class","Data"]. The goal is

Pyspark - GroupBy and Count combined with a WHERE

試著忘記壹切 submitted on 2020-01-05 07:51:32
Question: Say I have a list of magazine subscriptions, like so:

    subscription_id  user_id  created_at
    12384            1        2018-08-10
    83294            1        2018-06-03
    98234            1        2018-04-08
    24903            2        2018-05-08
    32843            2        2018-03-06
    09283            2        2018-04-07

Now I want to add a column that states how many previous subscriptions a user had before the current subscription. For example, if this is the user's first subscription, the new column's value should be 0. If they had one subscription starting before this subscription, the new column's value
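The counting rule described in the question can be sketched as follows. This is a pandas illustration of the idea, not PySpark code and not the original answer: it orders each user's rows by `created_at` and numbers them from 0. The column name `previous_subscriptions` is a hypothetical choice, and the sketch assumes no two subscriptions of the same user share a `created_at` value. In PySpark, a window partitioned by `user_id` and ordered by `created_at` would play the same role.

```python
import pandas as pd

# Sample data from the question; ids are kept as strings to preserve the leading zero.
subs = pd.DataFrame({
    "subscription_id": ["12384", "83294", "98234", "24903", "32843", "09283"],
    "user_id": [1, 1, 1, 2, 2, 2],
    "created_at": pd.to_datetime([
        "2018-08-10", "2018-06-03", "2018-04-08",
        "2018-05-08", "2018-03-06", "2018-04-07",
    ]),
})

# Sort by date, then number each user's rows 0, 1, 2, ... in that order.
# cumcount() keeps the original row labels, so the assignment aligns by index.
# "previous_subscriptions" is a hypothetical column name.
subs["previous_subscriptions"] = (
    subs.sort_values("created_at").groupby("user_id").cumcount()
)

print(subs)
```

The earliest subscription of each user gets 0, the next one 1, and so on.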

Linq - group by datetime for previous 12 months - include empty months

廉价感情. submitted on 2020-01-05 07:12:52
Question: I have a scenario whereby I need to retrieve a count of objects grouped by the month of a datetime field. I found the following post, which gets me part of the way there: "Linq: group by year and month, and manage empty months". However, I need to list the previous 12 months from today's date and the count of objects for each month, which is where I'm struggling. I've seen a few other posts with similar issues/solutions but I chose the above one as it's also a requirement to produce a record for
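The requirement described above (one row per month for the last 12 months, including months with no objects) can be illustrated with a short Python sketch. This is not LINQ and not the linked post's solution; the function name `monthly_counts` and the sample dates are hypothetical, and the sketch assumes the 12-month window includes the current month.

```python
from collections import Counter
from datetime import date, datetime


def monthly_counts(timestamps, today, months=12):
    """Return (year, month, count) for the last `months` months, oldest first.

    Assumption: the window ends at (and includes) today's month.
    """
    # Walk backwards from today's month to build every key in the window.
    keys = []
    year, month = today.year, today.month
    for _ in range(months):
        keys.append((year, month))
        month -= 1
        if month == 0:
            year, month = year - 1, 12
    keys.reverse()

    # Count the objects per (year, month); months without objects are simply absent here.
    counts = Counter((ts.year, ts.month) for ts in timestamps)

    # Looking up every key from the window fills the empty months with 0.
    return [(y, m, counts.get((y, m), 0)) for y, m in keys]


# Hypothetical sample data.
events = [
    datetime(2019, 12, 24, 9, 0),
    datetime(2019, 12, 31, 18, 30),
    datetime(2019, 3, 2),
    datetime(2018, 11, 5),  # outside the window, so it is ignored
]
result = monthly_counts(events, today=date(2020, 1, 5))
for row in result:
    print(row)
```

The key step is generating the list of months first and looking counts up against it, rather than grouping the data and hoping every month appears.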

Custom groupby based on column values

断了今生、忘了曾经 submitted on 2020-01-05 06:55:22
Question: Given this dataframe:

           C
    index
    0      9
    1      0
    2      1
    3      5
    4      0
    5      1
    6      2
    7      20
    8      0

How can I split this into groups such that Group 1 has [9, 0], Group 2 has [1, 5, 0], Group 3 has [1, 2, 20, 0]? The idea is to find all sequences that terminate with 0 and group them together. The sequences can vary in size, and the last sequence may not terminate with 0. The first element will never be 0. My end result looks something like this:

    C_new
    9
    6
    23

Where I find these groups and then sum them.

Answer 1:
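One possible approach, offered as a minimal sketch rather than as the original answer: mark each 0 as the end of a group, shift that marker down by one row so the next row starts a new group, and take a cumulative sum to get group labels. The label name `group` is a hypothetical choice.

```python
import pandas as pd

# Data from the question.
df = pd.DataFrame({"C": [9, 0, 1, 5, 0, 1, 2, 20, 0]})

# True on rows that close a sequence (value 0); shifting by one row moves the
# marker onto the first row of the next sequence, and cumsum() turns the
# markers into increasing group labels. "group" is a hypothetical name.
group_id = df["C"].eq(0).shift(fill_value=False).cumsum().rename("group")

# Sum each group; a trailing sequence without a closing 0 still forms its own group.
result = df.groupby(group_id)["C"].sum().rename("C_new")

print(result)
```

For the sample data the labels are 0, 0, 1, 1, 1, 2, 2, 2, 2, so the sums come out as 9, 6 and 23.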

Adding total row to a pandas DataFrame with tuples inside

天涯浪子 submitted on 2020-01-05 05:10:17
Question: Here is my previous question (which has been answered). It solved my initial problem, but now I am stuck on another one. I have the pandas.DataFrame below, to which I am trying to add total rows for each sub-level.

    Level  Company  Item
    1      X        a      (10, 20)
                    b      (10, 20)
           Y        a      (10, 20)
                    b      (10, 20)
                    c      (10, 20)
    2      X        a      (10, 20)
                    b      (10, 20)
                    c      (10, 20)
           Y        a      (10, 20)

I would like to get this:

    Level  Company  Item
    1      X        a      (10, 20)
                    b      (10, 20)
                    total  (20, 40)
           Y        a      (10, 20)
                    b      (10, 20)
                    c      (10, 20)
                    total  (30, 60)
           total           (50, 100)
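The totals in the desired output are element-wise sums of the tuples. A minimal sketch of computing them, assuming the data is a Series of 2-tuples on a (Level, Company, Item) MultiIndex: split the tuples into two numeric columns, sum per group, and zip the sums back into tuples. The column names `first` and `second` and the helper `to_tuple_dict` are hypothetical, and the sketch only computes the totals; it does not insert the `total` rows into the frame.

```python
import pandas as pd

# Data from the question, rebuilt as a Series of tuples on a three-level index.
index = pd.MultiIndex.from_tuples(
    [
        (1, "X", "a"), (1, "X", "b"),
        (1, "Y", "a"), (1, "Y", "b"), (1, "Y", "c"),
        (2, "X", "a"), (2, "X", "b"), (2, "X", "c"),
        (2, "Y", "a"),
    ],
    names=["Level", "Company", "Item"],
)
s = pd.Series([(10, 20)] * 9, index=index)

# Split each tuple into two numeric columns so ordinary sums apply.
# "first" and "second" are hypothetical column names.
parts = pd.DataFrame(s.tolist(), index=s.index, columns=["first", "second"])


def to_tuple_dict(frame):
    """Hypothetical helper: map each index key to a (first, second) tuple."""
    return dict(zip(frame.index, zip(frame["first"].tolist(), frame["second"].tolist())))


# Totals per (Level, Company) and per Level.
company_totals = to_tuple_dict(parts.groupby(level=["Level", "Company"]).sum())
level_totals = to_tuple_dict(parts.groupby(level="Level").sum())

print(company_totals)
print(level_totals)
```

Working on numeric columns avoids relying on how pandas aggregates tuple-valued cells.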

Python Pandas : How to return grouped lists in a column as a dict

穿精又带淫゛_ submitted on 2020-01-05 04:23:17
Question: Python Pandas : How to compile all lists in a column into one unique list

Starting with data from the previous question:

    df = pd.DataFrame({'id': ['a', 'b', 'a'],
                       'val': [['val1', 'val2'],
                               ['val33', 'val9', 'val6'],
                               ['val2', 'val6', 'val7']]})
    print(df)

      id                  val
    0  a         [val1, val2]
    1  b  [val33, val9, val6]
    2  a   [val2, val6, val7]

How do I get the lists into a dict:

    pd.Series([a for b in df.val.tolist() for a in b]).value_counts().to_dict()

    {'val1': 1, 'val2': 2, 'val33': 1, 'val6': 2, 'val7': 1, 'val9': 1}

How do
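The question is cut off, so the exact target is unknown. Going by the title, one reading is a dict of value counts per `id`. The sketch below works under that assumption only; `counts_by_id` is a hypothetical name.

```python
from collections import Counter

import pandas as pd

# Data from the question.
df = pd.DataFrame({
    "id": ["a", "b", "a"],
    "val": [["val1", "val2"], ["val33", "val9", "val6"], ["val2", "val6", "val7"]],
})

# Assumed goal: for each id, count how often each value appears across its lists.
# Iterating a grouped column yields (key, Series of lists) pairs.
counts_by_id = {
    key: dict(Counter(value for values in group for value in values))
    for key, group in df.groupby("id")["val"]
}

print(counts_by_id)
```

If a single flat dict across all rows is wanted instead, the `value_counts().to_dict()` expression from the question already produces it.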

How to use group by in symfony repository

会有一股神秘感。 submitted on 2020-01-05 04:03:14
Question: I have this code in DayRepository.php:

    public function findAllFromThisUser($user)
    {
        $query = $this->getEntityManager()
            ->createQuery(
                'SELECT d FROM AppBundle:Day d WHERE d.user = :user ORDER BY d.dayOfWeek ASC'
            )->setParameter('user', $user);

        try {
            return $query->getResult();
        } catch (\Doctrine\ORM\NoResultException $e) {
            return null;
        }
    }

In the controller DayController.php, I have this code:

    /**
     * @Route("/days/list", name="days_list_all")
     */
    public function listAllAction()
    {
        $user = $this-

Python pandas groupby conditional concatenate strings into multiple columns

强颜欢笑 submitted on 2020-01-04 11:02:38
Question: I am trying to group a dataframe by one column, keeping several columns from one row in each group and concatenating strings from the other rows into multiple columns based on the value of one column. Here is an example...

    df = pd.DataFrame({'test': ['a', 'a', 'a', 'a', 'a', 'a', 'b', 'b', 'b', 'b'],
                       'name': ['aa', 'ab', 'ac', 'ad', 'ae', 'ba', 'bb', 'bc', 'bd', 'be'],
                       'amount': [1, 2, 3, 4, 5, 6, 7, 8, 9, 9.5],
                       'role': ['x', 'y', 'y', 'x', 'x', 'z', 'y', 'y', 'z', 'y']})
    df

       amount name role test
    0     1.0   aa    x    a
    1

MySQL GROUP BY and COUNT

微笑、不失礼 submitted on 2020-01-04 09:43:51
Question: I have a small problem regarding a count after grouping some elements from a MySQL table. I have an orders table in which each order has several rows grouped by a code (named codcomanda). I have to write a query which counts the number of orders per customer and lists only the name and number of orders. This is what I came up with (this might be dumb, I'm not a pro programmer):

    SELECT a.nume, a.tel, (
        SELECT COUNT(*) AS `count`
        FROM (
            SELECT id AS `lwtemp`
            FROM lw_comenzi_confirmate AS yt

MySQL GROUP BY and COUNT

百般思念 submitted on 2020-01-04 09:42:55
Question: I have a small problem regarding a count after grouping some elements from a MySQL table. I have an orders table in which each order has several rows grouped by a code (named codcomanda). I have to write a query which counts the number of orders per customer and lists only the name and number of orders. This is what I came up with (this might be dumb, I'm not a pro programmer):

    SELECT a.nume, a.tel, (
        SELECT COUNT(*) AS `count`
        FROM (
            SELECT id AS `lwtemp`
            FROM lw_comenzi_confirmate AS yt