group-by

Apply a summarise condition to a range of columns when using dplyr group_by?

我与影子孤独终老i submitted on 2020-04-05 00:01:56
Question: Suppose we want to group_by() and summarise a massive data.frame with very many columns, where there are large runs of consecutive columns that share the same summarise condition (e.g. max, mean, etc.). Is there a way to avoid specifying the summarise condition for each and every column, and instead apply it to ranges of columns? Example: Suppose we want to do this: iris %>% group_by(Species) %>% summarise(max(Sepal.Length), mean(Sepal.Width), mean(Petal.Length), mean(Petal…
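The question is about dplyr, but the same idea (one aggregation applied to a whole range of columns rather than naming each) can be illustrated in pandas by building the aggregation mapping programmatically from a column slice. A minimal sketch, using a small made-up iris-like frame rather than the full iris dataset:

```python
import pandas as pd

# Hypothetical iris-like sample data (not the real iris dataset).
df = pd.DataFrame({
    "Species": ["setosa", "setosa", "virginica", "virginica"],
    "Sepal.Length": [5.1, 4.9, 6.3, 5.8],
    "Sepal.Width": [3.5, 3.0, 3.3, 2.7],
    "Petal.Length": [1.4, 1.4, 6.0, 5.1],
    "Petal.Width": [0.2, 0.2, 2.5, 1.9],
})

# One condition for a whole range of columns instead of one per column:
# max for Sepal.Length, mean for every column from Sepal.Width onward.
mean_cols = df.loc[:, "Sepal.Width":"Petal.Width"].columns
agg_map = {"Sepal.Length": "max", **{c: "mean" for c in mean_cols}}
out = df.groupby("Species", as_index=False).agg(agg_map)
print(out)
```

The column-label slice plays the role of dplyr's column range, so adding more consecutive columns requires no change to the aggregation code.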

Store a LINQ function in a variable and define it on the fly?

。_饼干妹妹 submitted on 2020-03-22 09:04:12
Question: I have a LINQ query like this: var results = StudentsList.GroupBy(x => x.GroupID).GroupBy(x => x.Any(g => g.IsQualified == true)).Select(g => g).ToList(); I want to store the part x.Any(g => g.IsQualified == true) in a variable so that I can change it on the fly (for example, to x.Any(g => g.StudentName == "John")) as my requirements change, without defining a separate LINQ query. Is that possible? Pseudo code: static void SomeFunction(Func<int, int> op) { var results = StudentsList.GroupBy(x => x…
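In C# the stored part would be a delegate (a Func<...>), but the underlying idea is language-neutral: keep the predicate in a variable and pass it into the query. A hypothetical Python analogue (the student records and field names here are made up to mirror the question):

```python
# Hypothetical records standing in for StudentsList.
students = [
    {"GroupID": 1, "StudentName": "John", "IsQualified": True},
    {"GroupID": 1, "StudentName": "Mary", "IsQualified": False},
    {"GroupID": 2, "StudentName": "Alex", "IsQualified": False},
]

def groups_matching(predicate):
    """Group by GroupID, then keep groups where any member matches predicate."""
    groups = {}
    for s in students:
        groups.setdefault(s["GroupID"], []).append(s)
    return [g for g in groups.values() if any(predicate(m) for m in g)]

# The condition lives in a variable and can be swapped at runtime,
# without writing a second query:
is_qualified = lambda s: s["IsQualified"]
named_john = lambda s: s["StudentName"] == "John"

qualified_groups = groups_matching(is_qualified)
john_groups = groups_matching(named_john)
```

Only the predicate variable changes between the two calls; the grouping query itself is written once.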

Limit number of items in group().by() in gremlin query

吃可爱长大的小学妹 submitted on 2020-03-21 20:28:57
Question: I am trying to run a Gremlin query that groups vertices with a certain label into several groups by a certain field (assume it is 'displayName'), limiting both the number of groups and the number of items in each group to n. Is there a way to achieve that? Since group().by() returns a list of the items, I tried using unfold() and then applying limit() to the inner items. I managed to limit the number of groups that are returned, but couldn't limit the number of items in each group. Here's…
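The exact Gremlin step combination depends on the server and TinkerPop version, so rather than guess at the traversal, here is a plain Python sketch of the result the question asks for: at most n groups, and at most n items kept per group (the vertex data is hypothetical):

```python
# Hypothetical vertices, each with a displayName property.
vertices = [
    {"id": 1, "displayName": "a"}, {"id": 2, "displayName": "a"},
    {"id": 3, "displayName": "a"}, {"id": 4, "displayName": "b"},
    {"id": 5, "displayName": "b"}, {"id": 6, "displayName": "c"},
]

def group_with_caps(items, key, n):
    """Group items by key, keeping at most n groups and n items per group."""
    groups = {}
    for item in items:
        k = item[key]
        if k not in groups and len(groups) == n:
            continue  # already have n groups; skip any new key
        if k in groups and len(groups[k]) == n:
            continue  # this group is already full; drop the item
        groups.setdefault(k, []).append(item)
    return groups

result = group_with_caps(vertices, "displayName", 2)
```

With n = 2 this keeps groups 'a' and 'b' with two items each and drops 'c', which is the behavior the two limits together should produce.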

Python pandas dataframe group by based on a condition

放肆的年华 submitted on 2020-03-17 07:59:09
Question: My question is simple: I have a dataframe, and I group by a column and get the group sizes like this: df.groupby('column').size() The problem is that I only want the groups whose size is greater than X. Can I do that with a lambda function or something similar? I have already tried df.groupby('column').size() > X, but that just prints out True and False values. Answer 1: The grouped result is a regular pandas Series, so just filter the results as usual: import pandas…
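The answer's point can be made concrete: size() returns a Series indexed by group key, and the boolean Series from the > X comparison is exactly what boolean indexing needs. A minimal runnable sketch with made-up data:

```python
import pandas as pd

df = pd.DataFrame({"column": ["a", "a", "a", "b", "b", "c"]})
X = 1

sizes = df.groupby("column").size()   # Series: a -> 3, b -> 2, c -> 1
big = sizes[sizes > X]                # keep only groups larger than X

# To keep the original rows (not just the counts) for those groups:
rows = df[df["column"].isin(big.index)]
```

The True/False output the asker saw is the boolean mask itself; indexing the Series with it turns the mask into the filtered result.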

Group by DataFrame with list and sum

落花浮王杯 submitted on 2020-03-16 11:33:40
Question: I have a pandas DataFrame df and I want to group by the text column with two aggregations: collect the english_word values into a list, and sum the count column. At the moment I can only do one or the other; when I try to do both, it returns an error. How can I do both aggregations? In short, what I want is: text: saya eat chicken, english_word: [eat, chicken], count: 2. I tried: df.groupby('text', as_index=False).agg({'count' : lambda x: x.sum(), 'english_word' : lambda x: x.list()}) This…
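The error in the attempt comes from x.list(), which is not a Series method; passing the built-in list as the aggregator (and 'sum' for count) performs both aggregations in one agg() call. A minimal sketch with sample rows matching the desired output:

```python
import pandas as pd

df = pd.DataFrame({
    "text": ["saya eat chicken", "saya eat chicken"],
    "english_word": ["eat", "chicken"],
    "count": [1, 1],
})

# One dict, two aggregations: "sum" totals the counts,
# the built-in list collects the words per group.
out = df.groupby("text", as_index=False).agg(
    {"count": "sum", "english_word": list}
)
```

This yields one row with count 2 and english_word [eat, chicken], which is the output the question describes.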
