group-by

Count groups of consecutive 1s in pandas

佐手、 posted on 2020-05-26 09:50:26
Question: I have a list of 1s and 0s, and I would like to count the groups of consecutive 1s.

    mylist = [0,0,1,1,0,1,1,1,1,0,1,0]

Counting by hand gives 3 groups, but is there a way to do it in Python?

Answer 1: Option 1: with pandas. First, initialise a dataframe:

    In [78]: df
    Out[78]:
        Col1
    0      0
    1      0
    2      1
    3      1
    4      0
    5      1
    6      1
    7      1
    8      1
    9      0
    10     1
    11     0

Now divide the sum total by the number of groups:

    In [79]: df.sum() / df.diff().eq(1).cumsum().max()
    Out[79]:
    Col1    2.333333
    dtype: float64

If you want just
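
As a rough illustration of the count the question asks for, here is a minimal standard-library sketch. It is an alternative to the pandas approach quoted above, not that answer's method; the helper name `count_groups_of_ones` is hypothetical.

```python
from itertools import groupby


def count_groups_of_ones(values):
    """Count runs of consecutive 1s in a sequence of 0s and 1s."""
    # groupby yields one (key, run) pair per run of equal adjacent values,
    # so counting the runs whose key is 1 counts the groups of 1s.
    return sum(1 for key, _run in groupby(values) if key == 1)


mylist = [0, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0]
print(count_groups_of_ones(mylist))  # 3
```

Because it counts runs rather than 0-to-1 transitions, this sketch also counts a group that begins at the very first element.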

PostgreSQL MAX and GROUP BY

拈花ヽ惹草 posted on 2020-05-24 08:11:50
Question: I have a table with id, year and count. I want to get the MAX(count) for each id and keep the year in which it occurs, so I wrote this query:

    SELECT id, year, MAX(count) FROM table GROUP BY id;

Unfortunately, it gives me an error:

    ERROR: column "table.year" must appear in the GROUP BY clause or be used in an aggregate function

So I tried:

    SELECT id, year, MAX(count) FROM table GROUP BY id, year;

But then it doesn't compute MAX(count); it just shows the table as it is. I suppose because when grouping
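
One common way to get the result the question describes is to compute the maximum per id in a subquery and join it back to the table to recover the matching year. The sketch below only illustrates that idea: it uses Python's built-in SQLite as a stand-in for PostgreSQL, and the table name `counts`, the column name `cnt` (standing in for `count`) and the sample rows are all hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE counts (id INTEGER, year INTEGER, cnt INTEGER)")
conn.executemany(
    "INSERT INTO counts VALUES (?, ?, ?)",
    [(1, 2018, 5), (1, 2019, 9), (1, 2020, 7), (2, 2019, 4), (2, 2020, 6)],
)

# The subquery finds MAX(cnt) per id; the join keeps the row(s) holding it,
# which is how the year travels along with the maximum.
rows = conn.execute(
    """
    SELECT c.id, c.year, c.cnt
    FROM counts AS c
    JOIN (SELECT id, MAX(cnt) AS max_cnt FROM counts GROUP BY id) AS m
      ON m.id = c.id AND m.max_cnt = c.cnt
    ORDER BY c.id
    """
).fetchall()
conn.close()

print(rows)  # [(1, 2019, 9), (2, 2020, 6)]
```

If two years tie for the maximum of an id, this query returns both rows.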

Splitting data into chunks and iterating over each chunk in R

霸气de小男生 posted on 2020-05-17 14:42:58
Question: I have a dataframe structured like this:

    birthwt  tobacco01  pscore  pscoreblocks    blocknumber
    3425     0          0.18    (0.177, 0.187]  1
    3527     1          0.15    (0.158, 0.168]  2
    1638     1          0.34    (0.335, 0.345]  3

Explaining the data: The birthwt column is a continuous variable measuring birth weight in grams. The tobacco01 column contains values of 0 or 1. The pscore column contains probability values between 0 and 1. The pscoreblocks column takes the pscore column and breaks it down into 100 equally sized blocks. The block number

Write a function in R to group factor levels by frequency, then keep the 2 largest categories and pool the rest in “other” [closed]

和自甴很熟 posted on 2020-05-17 06:29:53
Question: I would like to write a function in R that takes a single factor variable and a parameter n as inputs, computes the number of cases per category in the factor variable, keeps only the n categories with the most cases, and pools all other categories into a category "other". This
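
The question asks for R. Purely as a language-neutral illustration of the logic it describes (count cases per category, keep the n most frequent, pool the rest), here is a minimal Python sketch. The function name `lump_levels` and the sample data are hypothetical.

```python
from collections import Counter


def lump_levels(values, n, other="other"):
    """Keep the n most frequent categories and pool the rest into `other`."""
    counts = Counter(values)
    # most_common(n) returns the n categories with the highest counts.
    keep = {level for level, _count in counts.most_common(n)}
    return [value if value in keep else other for value in values]


colours = ["red", "blue", "red", "green", "blue", "red", "pink"]
lumped = lump_levels(colours, 2)
print(lumped)
# ['red', 'blue', 'red', 'other', 'blue', 'red', 'other']
```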

Complex nested aggregations to get order totals

限于喜欢 posted on 2020-05-17 03:03:35
Question: I have a system to track orders and related expenditures. This is a Rails app running on PostgreSQL. 99% of my app gets by with plain old Rails Active Record calls, etc. This one is ugly. The expenditures table looks like this:

    +----+----------+-----------+------------------------+
    | id | category | parent_id | note                   |
    +----+----------+-----------+------------------------+
    | 1  | order    | nil       | order with no invoices |
    +----+----------+-----------+------------------------+
    | 2  | order    | nil       | order

How do window functions and the group by clause interact?

笑着哭i posted on 2020-05-14 18:06:23
Question: I understand window functions and GROUP BY separately. But what happens when you use both a window function and a GROUP BY clause in the same query? Are the selected rows grouped first and then considered by the window function? Or does the window function execute first, with the resulting values then grouped by GROUP BY? Something else?

Answer 1: Quote from the manual: If the query contains any window functions, these functions are evaluated after any grouping, aggregation, and HAVING
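
The ordering in the quoted passage can be observed with a small experiment. This is a hedged sketch: it uses Python's built-in SQLite (3.25 or later, for window function support) as a stand-in for PostgreSQL, on the assumption that it orders grouping and window evaluation the same way for this query. The table `emp` and its rows are hypothetical.

```python
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?)",
    [("ann", "a"), ("bob", "a"), ("cy", "b")],
)

# COUNT(*) is computed per group first; the window function SUM(...) OVER ()
# then runs over the grouped rows (one row per dept), not over the raw rows.
rows = conn.execute(
    """
    SELECT dept,
           COUNT(*) AS n,
           SUM(COUNT(*)) OVER () AS total
    FROM emp
    GROUP BY dept
    ORDER BY dept
    """
).fetchall()
conn.close()

print(rows)  # [('a', 2, 3), ('b', 1, 3)]
```

The window sees two rows, one per department, and sums their counts, which is consistent with the window function being applied to the output of the grouping step.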
