group-by

MySQL selecting max record in group by

自作多情 posted on 2020-01-12 04:53:08
Question: I am trying to write a query against a table that has some 500,000 records and some 50 or 60 columns. What I need is to collate these records into groups and select the max record in each group. To simplify the problem, I have a table as follows:
+----+-------------+----------+--------+
| id | external_id | group_id | mypath |
+----+-------------+----------+--------+
|  1 |        1003 |        1 | a      |
|  2 |        1004 |        2 | b      |
|  3 |        1005 |        2 | c      |
+----+-------------+----------+--------+
The simple group by is as

Counting multiple rows in MySQL in one query

倖福魔咒の posted on 2020-01-12 03:49:26
Question: I currently have a table which stores a load of statistics such as views, downloads, purchases etc. for multiple items. To get a single operation count for each item I can use the following query:
    SELECT *, COUNT(*) FROM stats WHERE operation = 'view' GROUP BY item_id
This gives me all the items and a count of their views. I can then change 'view' to 'purchase' or 'download' for the other variables. However, this means three separate calls to the database. Is it possible to get all
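The excerpt is cut off, but the question appears to ask for all three counts in a single query. A minimal sketch of one common approach, conditional aggregation, is shown here against an in-memory SQLite database driven from Python so that it is self-contained. The sample rows and the column aliases (views, downloads, purchases) are assumptions for illustration; the same SUM(CASE ...) pattern should carry over to MySQL.

```python
import sqlite3


def count_operations(rows):
    """Return (item_id, views, downloads, purchases) per item using one query."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE stats (item_id INTEGER, operation TEXT)")
    conn.executemany("INSERT INTO stats VALUES (?, ?)", rows)
    result = conn.execute(
        """
        SELECT item_id,
               SUM(CASE WHEN operation = 'view'     THEN 1 ELSE 0 END) AS views,
               SUM(CASE WHEN operation = 'download' THEN 1 ELSE 0 END) AS downloads,
               SUM(CASE WHEN operation = 'purchase' THEN 1 ELSE 0 END) AS purchases
        FROM stats
        GROUP BY item_id
        ORDER BY item_id
        """
    ).fetchall()
    conn.close()
    return result


# Hypothetical sample data: (item_id, operation)
sample_rows = [
    (1, "view"), (1, "view"), (1, "download"),
    (2, "view"), (2, "purchase"), (2, "purchase"),
]
counts = count_operations(sample_rows)
print(counts)
```

Each CASE expression contributes 1 only for rows of the matching operation, so a single pass over the table yields one column per operation instead of one query per operation.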

MySQL: how to get x number of results per grouping [duplicate]

ε祈祈猫儿з posted on 2020-01-11 20:00:27
Question: Possible duplicate: mysql: Using LIMIT within GROUP BY to get N results per group?
I have two tables: Items and Categories. Each item belongs to a category. What I want to do is select 5 items per category, but say 20 items in total.
    SELECT item_id, item_name, items.catid
    FROM items, categories
    WHERE items.catid = categories.catid
    GROUP BY items.catid
    LIMIT 0,5 //5 per category group
Edit: if there are more than 5 items per category -
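The excerpt ends before the asker says which items should be kept, so the following is only a sketch under stated assumptions: it keeps the lowest item_id values in each category, caps the overall result with LIMIT, and leaves out the join to categories. It uses the ROW_NUMBER() window function, run here through Python's sqlite3 module (SQLite 3.25 or later). MySQL gained window functions in version 8.0, so older MySQL versions would need a different technique.

```python
import sqlite3


def top_n_per_category(items, n_per_category, total_limit):
    """Return at most n_per_category rows per catid and total_limit rows overall."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE items (item_id INTEGER, item_name TEXT, catid INTEGER)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", items)
    rows = conn.execute(
        """
        SELECT item_id, item_name, catid
        FROM (
            SELECT item_id, item_name, catid,
                   -- Number the rows inside each category (assumed order: item_id)
                   ROW_NUMBER() OVER (PARTITION BY catid ORDER BY item_id) AS rn
            FROM items
        )
        WHERE rn <= ?
        ORDER BY catid, item_id
        LIMIT ?
        """,
        (n_per_category, total_limit),
    ).fetchall()
    conn.close()
    return rows


# Hypothetical sample data: (item_id, item_name, catid)
sample_items = [
    (1, "a", 1), (2, "b", 1), (3, "c", 1),
    (4, "d", 2), (5, "e", 2),
]
selected = top_n_per_category(sample_items, n_per_category=2, total_limit=3)
print(selected)
```

With the question's numbers the call would be top_n_per_category(items, 5, 20); the smaller values above only keep the sample data short.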

How to find duplicate names using pandas?

帅比萌擦擦* posted on 2020-01-11 17:40:12
Question: I have a pandas.DataFrame with a column called name containing strings. I would like to get a list of the names which occur more than once in the column. How do I do that? I tried:
    funcs_groups = funcs.groupby(funcs.name)
    funcs_groups[(funcs_groups.count().name>1)]
But it doesn't filter out the singleton names.
Answer 1: If you want to find the rows with a duplicated name (except the first time we see it), you can try this:
    In [16]: import pandas as pd
    In [17]: p1 = {'name': 'willy', 'age': 10}
    In
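The quoted answer is cut off, so the sketch below is not a reconstruction of it. It shows two standard pandas calls that yield the list of names occurring more than once: Series.value_counts and Series.duplicated with keep=False. The DataFrame name funcs comes from the question; apart from 'willy', the sample names are made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the column name "name" is from the question.
funcs = pd.DataFrame({"name": ["willy", "al", "willy", "bob", "al", "carol"]})

# Option 1: count each name, then keep the names whose count exceeds one.
counts = funcs["name"].value_counts()
duplicate_names = sorted(counts[counts > 1].index)

# Option 2: flag every row whose name appears elsewhere (keep=False marks all
# occurrences, including the first), then take the distinct names.
dup_mask = funcs["name"].duplicated(keep=False)
duplicate_names_alt = sorted(funcs.loc[dup_mask, "name"].unique())

print(duplicate_names)
print(duplicate_names_alt)
```

Option 2 also gives direct access to the duplicated rows themselves through funcs[dup_mask], which may be closer to what the answer excerpt describes.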

Finding first occurrence of multiple values in every group sorted by date in SQL

我们两清 posted on 2020-01-11 13:05:50
Question: I have a table with every operation that happens before an event, grouped by another value. There are only 3 operations: R, E, P.
+-----------+----------+-----------+------------------------+
| Rollcycle | Blocking | Operation | Order                  |
+-----------+----------+-----------+------------------------+
| 1         | 3        | R         | 4                      |
| 1         | 3        | P         | 3                      |
| 1         | 3        | E         | 2                      |
| 1         | 3        | R         | 1                      |
| 1         | 2        | P         | 3                      |
| 1         | 2        | E         | 2                      |
| 1         | 2        | R         | 1                      |
| 1         | 1        | R         | 1                      |
| 2         | 1        | E         | 2                      |
| 2         | 1        | R         | 1                      |
+-----------+---------

Plotting pandas groupby output using matplotlib subplots

蓝咒 posted on 2020-01-11 11:21:48
Question: I have a dataframe, df2, which has 6 rows and 1591 columns:
           0.0.0  10.1.21  1.5.12  3.7.8  3.5.8  1.7.8  ...
June           1        1       4      0      0      4
July           0        0       0      0      0      0
August        54        0       9      0      5      0
September     22        0       6      0      0      1
October        0        9       5      1      4      0
I want to plot groups of 3 columns in each panel of a figure as stacked bars. That is, columns 0.0.0 to 1.5.12 should be plotted in one panel and columns 3.7.8 to 1.7.8 in another panel. Here is the code:
    df = df2
    df['key1'] = 0
    df.key1.loc[:, ['0.0.0', '10.1.21', '1.5.12']].values = 1
    df.key1

MySQL, multiple rows to separate fields

放肆的年华 posted on 2020-01-11 08:52:48
Question: I have a MySQL table with fields and data as follows:
PartNumber  Priority  SupName
a1          0         One
a2          0         One
a2          1         Two
a3          0         One
a4          1         Two
a5          2         Three
I am trying to create a view where the parts that have multiple rows are combined into a single row, with the suppliers in separate fields. Ideally this:
PartNumber  Sup1   Sup2  Sup3
a1          One    NULL  NULL
a2          One    Two   NULL
a3          One    NULL  NULL
a4          Two    NULL  NULL
a5          Three  NULL  NULL
Or I can live with this:
PartNumber  Sup1   Sup2  Sup3
a1          One    NULL  NULL
a2          One    Two   NULL
a3          One    NULL

Sequential Group By in sql server

巧了我就是萌 posted on 2020-01-11 07:47:29
Question: For this table:
+----+--------+-------+
| ID | Status | Value |
+----+--------+-------+
|  1 |      1 |     4 |
|  2 |      1 |     7 |
|  3 |      1 |     9 |
|  4 |      2 |     1 |
|  5 |      2 |     7 |
|  6 |      1 |     8 |
|  7 |      1 |     9 |
|  8 |      2 |     1 |
|  9 |      0 |     4 |
| 10 |      0 |     3 |
| 11 |      0 |     8 |
| 12 |      1 |     9 |
| 13 |      3 |     1 |
+----+--------+-------+
I need to sum sequential groups with the same Status to produce this result:
+--------+------------+
| Status | Sum(Value) |
+--------+------------+
|      1 |         20 |
|      2 |          8 |
|      1 |         17 |
|      2 |          1 |
|      0 |         15
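The expected result above is cut off, and no answer is included in this excerpt. As a sketch of one well-known technique for this kind of problem (often called "gaps and islands"), the difference between two ROW_NUMBER() sequences can serve as a key for each run of consecutive rows with the same Status. It is run here through Python's sqlite3 module (SQLite 3.25 or later) on the table from the question; the table name t and the alias grp are assumptions, and SQL Server supports the same window function.

```python
import sqlite3

# Rows from the question: (ID, Status, Value)
data = [
    (1, 1, 4), (2, 1, 7), (3, 1, 9), (4, 2, 1), (5, 2, 7),
    (6, 1, 8), (7, 1, 9), (8, 2, 1), (9, 0, 4), (10, 0, 3),
    (11, 0, 8), (12, 1, 9), (13, 3, 1),
]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER, status INTEGER, value INTEGER)")
conn.executemany("INSERT INTO t VALUES (?, ?, ?)", data)

# The overall row number minus the per-status row number stays constant
# within a run of consecutive rows sharing a status, and changes between runs.
sums = conn.execute(
    """
    SELECT status, SUM(value) AS total
    FROM (
        SELECT id, status, value,
               ROW_NUMBER() OVER (ORDER BY id)
             - ROW_NUMBER() OVER (PARTITION BY status ORDER BY id) AS grp
        FROM t
    )
    GROUP BY status, grp
    ORDER BY MIN(id)
    """
).fetchall()
conn.close()
print(sums)
```

The first five rows of the output match the five result rows that survive in the excerpt (20, 8, 17, 1, 15).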

MYSQL last login and number of logins in last 3 months

*爱你&永不变心* posted on 2020-01-11 06:32:11
Question: I'm trying to write a query to join a user table to an activity logging table and return the following for each user:
A) The time they last logged in.
B) The number of logins in the last 3 months.
This is what I have come up with so far:
    SELECT A.UserID, COUNT( Activity ) AS Logins, MAX( TIME ) AS LastLogin
    FROM UserMaster A
    LEFT JOIN UserWebActivity B ON A.UserID = B.UserID AND Activity = 'Login' AND TIME BETWEEN DATE_SUB( NOW( ) , INTERVAL 3 MONTH ) AND NOW( )
    GROUP BY A.UserID
This almost

Why does `summarize` drop a group?

假如想象 posted on 2020-01-11 06:13:10
Question: I'm fooling around with the babynames package. A group_by command works, but after the summarize, one of the groups is dropped from the group list.
    library(babynames)
    babynames[1:10000, ] %>% group_by(year, name) %>% head(1)
    # A tibble: 1 x 5
    # Groups:   year, name [1]
       year sex   name      n       prop
      <dbl> <chr> <chr> <int>      <dbl>
    1  1880 F     Mary   7065 0.07238433
This is fine: two groups, year and name. But after a summarize (which respects the groups correctly), the name group is dropped. Am I missing an easy