group-by | 易学教程

MySQL selecting max record in group by

Question: I am trying to create a query on a table that has some 500,000 records and some 50 or 60 columns. What I need is to collate these records into groups and select the max record in each group. To simplify the problem, I have a table as follows:

+----+-------------+----------+--------+
| id | external_id | group_id | mypath |
+----+-------------+----------+--------+
| 1  | 1003        | 1        | a      |
| 2  | 1004        | 2        | b      |
| 3  | 1005        | 2        | c      |
+----+-------------+----------+--------+

The simple group by is as
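One common way to pick a whole row per group is to compute the per-group maximum in a subquery and join it back to the table. The sketch below uses an in-memory SQLite database as a stand-in for MySQL. It assumes that "max record" means the row with the largest `external_id` within each `group_id` (the excerpt does not say which column defines the maximum), and the table name `t` is hypothetical.

```python
import sqlite3

# In-memory SQLite stands in for MySQL; the table name `t` is an assumption.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE t (id INTEGER, external_id INTEGER, group_id INTEGER, mypath TEXT)"
)
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?, ?)",
    [(1, 1003, 1, "a"), (2, 1004, 2, "b"), (3, 1005, 2, "c")],
)

# Assumption: the "max record" is the row with the highest external_id per group.
query = """
SELECT t.id, t.external_id, t.group_id, t.mypath
FROM t
JOIN (
    SELECT group_id, MAX(external_id) AS max_ext
    FROM t
    GROUP BY group_id
) AS m
  ON m.group_id = t.group_id AND m.max_ext = t.external_id
ORDER BY t.group_id
"""
max_rows = conn.execute(query).fetchall()
print(max_rows)
```

Joining back to the base table returns every column of the winning row, which a bare `GROUP BY` with `MAX()` does not guarantee.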

Counting multiple rows in MySQL in one query

Question: I currently have a table which stores a load of statistics such as views, downloads, purchases, etc. for a number of items. To get a single operation count on each item I can use the following query:

SELECT *, COUNT(*) FROM stats WHERE operation = 'view' GROUP BY item_id

This gives me all the items and a count of their views. I can then change 'view' to 'purchase' or 'download' for the other variables. However, this means three separate calls to the database. Is it possible to get all
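Counting several operations in one pass is usually done with conditional aggregation: one `SUM(CASE ...)` column per operation. A minimal sketch follows, run against in-memory SQLite rather than MySQL. The `stats` table and its `item_id` and `operation` columns come from the question; the sample rows are made up for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE stats (item_id INTEGER, operation TEXT)")
# Illustrative sample rows (not from the original question).
conn.executemany(
    "INSERT INTO stats VALUES (?, ?)",
    [
        (1, "view"), (1, "view"), (1, "purchase"),
        (2, "view"), (2, "download"), (2, "download"),
    ],
)

# One query, one row per item, one counter column per operation.
query = """
SELECT item_id,
       SUM(CASE WHEN operation = 'view'     THEN 1 ELSE 0 END) AS views,
       SUM(CASE WHEN operation = 'purchase' THEN 1 ELSE 0 END) AS purchases,
       SUM(CASE WHEN operation = 'download' THEN 1 ELSE 0 END) AS downloads
FROM stats
GROUP BY item_id
ORDER BY item_id
"""
counts = conn.execute(query).fetchall()
print(counts)
```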

MySQL: how to get x number of results per grouping [duplicate]

Question: Possible Duplicate: mysql: Using LIMIT within GROUP BY to get N results per group?

I have two tables: Items and Categories. Each item belongs to a category. What I want to do is select 5 items per category but, say, 20 items in total.

SELECT item_id, item_name, items.catid FROM items, categories WHERE items.catid = categories.catid GROUP BY items.catid LIMIT 0,5 //5 per category group

Edit: if there are more than 5 items per category -
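A per-group limit can be sketched with a correlated subquery that counts how many rows of the same category come earlier, keeping only rows whose count is below the limit; an outer `LIMIT` then caps the total. The example uses in-memory SQLite instead of MySQL, and it assumes "first" means lowest `item_id` within a category, which the excerpt does not specify. To keep the data small it uses 2 per category and 4 in total in place of 5 and 20, and it skips the join to `categories` because `catid` is already on `items`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE items (item_id INTEGER, item_name TEXT, catid INTEGER)")
# Illustrative sample rows (not from the original question).
conn.executemany(
    "INSERT INTO items VALUES (?, ?, ?)",
    [
        (1, "a", 1), (2, "b", 1), (3, "c", 1),
        (4, "d", 2), (5, "e", 2), (6, "f", 2),
        (7, "g", 3),
    ],
)

PER_CATEGORY = 2  # stands in for the question's 5
TOTAL = 4         # stands in for the question's 20

# Keep a row only if fewer than PER_CATEGORY rows of its category precede it.
query = """
SELECT i.item_id, i.item_name, i.catid
FROM items AS i
WHERE (
    SELECT COUNT(*)
    FROM items AS j
    WHERE j.catid = i.catid AND j.item_id < i.item_id
) < ?
ORDER BY i.catid, i.item_id
LIMIT ?
"""
limited = conn.execute(query, (PER_CATEGORY, TOTAL)).fetchall()
print(limited)
```

On engines with window functions, `ROW_NUMBER() OVER (PARTITION BY catid ORDER BY item_id)` expresses the same ranking more directly.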

How to find duplicate names using pandas?

Question: I have a pandas.DataFrame with a column called name containing strings. I would like to get a list of the names which occur more than once in the column. How do I do that? I tried:

funcs_groups = funcs.groupby(funcs.name)
funcs_groups[(funcs_groups.count().name>1)]

But it doesn't filter out the singleton names.

Answer 1: If you want to find the rows with a duplicated name (except the first time we see it), you can try this:

In [16]: import pandas as pd
In [17]: p1 = {'name': 'willy', 'age': 10}
In

Finding first occurence of multiples value in every group sort by date in SQL

Question: I have a table with every operation that happens before an event, grouped by another value. There are only 3 operations: R, E, P

+-----------+----------+-----------+-------+
| Rollcycle | Blocking | Operation | Order |
+-----------+----------+-----------+-------+
| 1         | 3        | R         | 4     |
| 1         | 3        | P         | 3     |
| 1         | 3        | E         | 2     |
| 1         | 3        | R         | 1     |
| 1         | 2        | P         | 3     |
| 1         | 2        | E         | 2     |
| 1         | 2        | R         | 1     |
| 1         | 1        | R         | 1     |
| 2         | 1        | E         | 2     |
| 2         | 1        | R         | 1     |
+-----------+---------

Plotting pandas groupby output using matplotlib subplots

Question: I have a dataframe, df2, which has 6 rows and 1591 columns:

            0.0.0  10.1.21  1.5.12  3.7.8  3.5.8  1.7.8  ...
June            1        1       4      0      0      4
July            0        0       0      0      0      0
August         54        0       9      0      5      0
September      22        0       6      0      0      1
October         0        9       5      1      4      0

I want to plot the columns in groups of 3, each group in its own panel of a figure, as stacked bars. That is, columns 0.0.0 to 1.5.12 should be plotted in one panel and columns 3.7.8 to 1.7.8 in another. Here is the code:

df= df2
df['key1'] = 0
df.key1.loc[:, ['0.0.0', '10.1.21', '1.5.12']].values = 1
df.key1

MySQL, multiple rows to separate fields

Question: I have a MySQL table with fields and data as follows:

PartNumber  Priority  SupName
a1          0         One
a2          0         One
a2          1         Two
a3          0         One
a4          1         Two
a5          2         Three

I am trying to create a view where the parts that have multiple rows are combined into a single row with separate fields. Ideally this:

PartNumber  Sup1   Sup2  Sup3
a1          One    NULL  NULL
a2          One    Two   NULL
a3          One    NULL  NULL
a4          Two    NULL  NULL
a5          Three  NULL  NULL

Or I can live with this:

PartNumber  Sup1  Sup2  Sup3
a1          One   NULL  NULL
a2          One   Two   NULL
a3          One   NULL
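The "ideal" layout above packs each part's suppliers into `Sup1`, `Sup2`, `Sup3` in priority order, regardless of the actual priority value. One way to sketch that is to rank each row within its part and then pivot on the rank with `MAX(CASE ...)`. The example runs on in-memory SQLite rather than MySQL. The table name `parts` is hypothetical, and the ranking subquery assumes that `Priority` is unique within each `PartNumber`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `parts` is a hypothetical table name; the columns and rows are from the question.
conn.execute("CREATE TABLE parts (PartNumber TEXT, Priority INTEGER, SupName TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [
        ("a1", 0, "One"), ("a2", 0, "One"), ("a2", 1, "Two"),
        ("a3", 0, "One"), ("a4", 1, "Two"), ("a5", 2, "Three"),
    ],
)

# Step 1: rank suppliers within each part by Priority (1 = lowest value).
# Step 2: pivot the rank into columns; missing ranks come out as NULL.
query = """
SELECT PartNumber,
       MAX(CASE WHEN rn = 1 THEN SupName END) AS Sup1,
       MAX(CASE WHEN rn = 2 THEN SupName END) AS Sup2,
       MAX(CASE WHEN rn = 3 THEN SupName END) AS Sup3
FROM (
    SELECT p.PartNumber, p.SupName,
           (SELECT COUNT(*)
            FROM parts AS q
            WHERE q.PartNumber = p.PartNumber
              AND q.Priority <= p.Priority) AS rn
    FROM parts AS p
) AS ranked
GROUP BY PartNumber
ORDER BY PartNumber
"""
pivoted = conn.execute(query).fetchall()
for row in pivoted:
    print(row)
```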

Sequential Group By in sql server

Question: For this table:

+----+--------+-------+
| ID | Status | Value |
+----+--------+-------+
| 1  | 1      | 4     |
| 2  | 1      | 7     |
| 3  | 1      | 9     |
| 4  | 2      | 1     |
| 5  | 2      | 7     |
| 6  | 1      | 8     |
| 7  | 1      | 9     |
| 8  | 2      | 1     |
| 9  | 0      | 4     |
| 10 | 0      | 3     |
| 11 | 0      | 8     |
| 12 | 1      | 9     |
| 13 | 3      | 1     |
+----+--------+-------+

I need to sum sequential groups with the same Status to produce this result.

+--------+------------+
| Status | Sum(Value) |
+--------+------------+
| 1      | 20         |
| 2      | 8          |
| 1      | 17         |
| 2      | 1          |
| 0      | 15
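Summing runs of consecutive rows with the same `Status` is a "gaps and islands" problem. A common sketch takes the difference between two `ROW_NUMBER()` sequences, one over the whole table and one per `Status`. That difference is constant within each run, so it can serve as a grouping key alongside `Status`. The example uses in-memory SQLite as a stand-in for SQL Server and needs SQLite 3.25 or later for window functions. The table name `t` is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# `t` is a hypothetical table name; the rows are the ones shown in the question.
conn.execute("CREATE TABLE t (ID INTEGER, Status INTEGER, Value INTEGER)")
conn.executemany(
    "INSERT INTO t VALUES (?, ?, ?)",
    [
        (1, 1, 4), (2, 1, 7), (3, 1, 9), (4, 2, 1), (5, 2, 7),
        (6, 1, 8), (7, 1, 9), (8, 2, 1), (9, 0, 4), (10, 0, 3),
        (11, 0, 8), (12, 1, 9), (13, 3, 1),
    ],
)

# island = overall row number minus row number within the same Status.
# Rows in one unbroken run of a Status share the same island value.
query = """
SELECT Status, SUM(Value) AS total
FROM (
    SELECT ID, Status, Value,
           ROW_NUMBER() OVER (ORDER BY ID)
         - ROW_NUMBER() OVER (PARTITION BY Status ORDER BY ID) AS island
    FROM t
) AS numbered
GROUP BY Status, island
ORDER BY MIN(ID)
"""
runs = conn.execute(query).fetchall()
print(runs)
```

The first five rows returned agree with the visible part of the expected result in the question: (1, 20), (2, 8), (1, 17), (2, 1), (0, 15).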

MYSQL last login and number of logins in last 3 months

Question: I'm trying to write a query to join a user table to an activity logging table and return the following for each user: A) the time they last logged in; B) the number of logins in the last 3 months. This is what I have come up with so far:

SELECT A.UserID, COUNT( Activity ) AS Logins, MAX( TIME ) AS LastLogin
FROM UserMaster A
LEFT JOIN UserWebActivity B ON A.UserID = B.UserID AND Activity = 'Login' AND TIME BETWEEN DATE_SUB( NOW( ) , INTERVAL 3 MONTH ) AND NOW( )
GROUP BY A.UserID

This almost
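The excerpt is cut off before it says what goes wrong. One plausible issue, taken here as an assumption, is that the date filter in the join also restricts `MAX(TIME)`, so a user whose last login is older than 3 months gets no `LastLogin` at all. A possible sketch moves the date condition into a conditional `COUNT`, so the join keeps every login while the counter only sees recent ones. It runs on in-memory SQLite instead of MySQL, stores `TIME` as ISO date strings, and passes a fixed reference date (`:now`, hypothetical) in place of `NOW()` to keep the output deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE UserMaster (UserID INTEGER)")
conn.execute("CREATE TABLE UserWebActivity (UserID INTEGER, Activity TEXT, TIME TEXT)")
conn.executemany("INSERT INTO UserMaster VALUES (?)", [(1,), (2,), (3,)])
# Illustrative sample rows (not from the original question).
conn.executemany(
    "INSERT INTO UserWebActivity VALUES (?, ?, ?)",
    [
        (1, "Login", "2024-06-01"),
        (1, "Login", "2024-01-10"),
        (1, "View", "2024-06-02"),
        (2, "Login", "2023-12-25"),
    ],
)

# The join keeps all logins; only the COUNT is limited to the last 3 months.
query = """
SELECT A.UserID,
       COUNT(CASE WHEN B.TIME >= date(:now, '-3 months') THEN 1 END) AS Logins,
       MAX(B.TIME) AS LastLogin
FROM UserMaster AS A
LEFT JOIN UserWebActivity AS B
       ON A.UserID = B.UserID AND B.Activity = 'Login'
GROUP BY A.UserID
ORDER BY A.UserID
"""
# Fixed reference date instead of NOW(), an assumption for reproducibility.
logins = conn.execute(query, {"now": "2024-06-15"}).fetchall()
print(logins)
```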

Why does `summarize` drop a group?

Question: I'm fooling around with the babynames package. A group_by command works, but after the summarize, one of the groups is dropped from the group list.

library(babynames)
babynames[1:10000, ] %>% group_by(year, name) %>% head(1)
# A tibble: 1 x 5
# Groups: year, name [1]
   year sex   name      n       prop
  <dbl> <chr> <chr> <int>      <dbl>
1  1880 F     Mary   7065 0.07238433

This is fine: two groups, year, name. But after a summarize (which respects the groups correctly), the name group is dropped. Am I missing an easy