group-by

How do GROUP BY and HAVING work

只谈情不闲聊 · Submitted on 2019-12-13 22:22:16
Question: I am new to SQL, and after writing some queries I wanted to understand how SQL "internally" processes them. I took one query from another post on Stack Overflow:

```sql
select name
from contacts
group by name
having count(*) > 1
```

My question is: if `group by name` merges all rows with the same name into one row, how does `count` then know how many rows with the same name were merged? I am trying to split the processing of the query into individual steps in order to understand how exactly it works, but in…
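A helpful mental model: the engine forms the groups first, evaluates `COUNT(*)` against each group's member rows, and only then collapses each group to a single output row; `HAVING` filters on the per-group result. A minimal runnable sketch using Python's `sqlite3` (the table mirrors the question's `contacts`; the sample names are made up):

```python
import sqlite3

# In-memory database with a contacts table like the question's
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contacts (name TEXT)")
conn.executemany("INSERT INTO contacts (name) VALUES (?)",
                 [("alice",), ("bob",), ("alice",), ("carol",), ("bob",)])

# COUNT(*) is evaluated per group *before* each group is collapsed
# to one output row, so HAVING can filter on the per-group count.
dupes = conn.execute(
    "SELECT name FROM contacts GROUP BY name HAVING COUNT(*) > 1"
).fetchall()
print(sorted(dupes))  # [('alice',), ('bob',)]
```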

python pandas: how to group by and count with a condition for every value in a column?

狂风中的少年 · Submitted on 2019-12-13 21:26:51
Question: I have a table like this:

```
d  group
1  a
2  b
3  a
4  c
5  f
```

and I would like to iterate over the values of `d` and count the number of rows that have `group = a`. Here is what I am doing now, but it does not work:

```python
for index, row in df.iterrows():
    for x in (1, 5):
        if row['d'] > x:
            row['tp'] = df.groupby('group').agg(lambda x: x.manual_type == 'a')
```

Can anybody help?

Answer 1: Try:

```python
df['group'].value_counts()['a']
```

In general, you should never use `for` loops in pandas; it is inefficient and usually re-creates some existing…
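The answer's point in runnable form, using the question's own data: counting the rows where `group == 'a'` needs no loop at all.

```python
import pandas as pd

df = pd.DataFrame({"d": [1, 2, 3, 4, 5],
                   "group": ["a", "b", "a", "c", "f"]})

# Count occurrences of each group value, then index into the result
n_a = df["group"].value_counts()["a"]
print(n_a)  # 2

# Equivalent boolean-mask form, often clearer for a single value
n_a2 = (df["group"] == "a").sum()
```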

Oracle SQL Grouping/Ordering

喜你入骨 · Submitted on 2019-12-13 21:09:58
Question: I'm looking for some help in writing an Oracle SQL statement to accomplish the following. Let's say I have this data:

```
YEAR | PLACE
1984 | somewhere
1983 | work
1985 | somewhere
1982 | home
1984 | work
1983 | home
1984 | somewhere
```

How can I get a result that keeps all rows with the same PLACE value together and orders the groups by the YEAR column? The result I'm looking for is:

```
YEAR | PLACE
1982 | home
1983 | home
1983 | work
1984 | work
1984 | somewhere
1984 | somewhere
1985 | somewhere
```
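One common approach (not necessarily the one the asker ended up with): sort on each PLACE group's minimum year first, then by year within the group, using an analytic `MIN`. Oracle supports this window form, and so does SQLite, so the idea can be sketched with Python's `sqlite3`; the table name `t` is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (year INTEGER, place TEXT)")
conn.executemany("INSERT INTO t VALUES (?, ?)", [
    (1984, "somewhere"), (1983, "work"), (1985, "somewhere"),
    (1982, "home"), (1984, "work"), (1983, "home"), (1984, "somewhere"),
])

# Order groups by each place's earliest year, then by year within the group
rows = conn.execute("""
    SELECT year, place FROM t
    ORDER BY MIN(year) OVER (PARTITION BY place), place, year
""").fetchall()
for r in rows:
    print(r)
```

This reproduces exactly the desired output above: home (1982, 1983), then work (1983, 1984), then somewhere (1984, 1984, 1985).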

Problem when grouping

穿精又带淫゛_ · Submitted on 2019-12-13 17:53:15
Question: I have this MySQL query:

```sql
SELECT forum_categories.title, forum_messages.author, forum_messages.date AS last_message
FROM forum_categories
JOIN forum_topics ON forum_topics.category_id = forum_categories.id
JOIN forum_messages ON forum_messages.topic_id = forum_topics.id
WHERE forum_categories.id = 6
ORDER BY forum_categories.date ASC
```

And the output is the following:

```
Welcome  daniel     2010-07-09 22:14:49
Welcome  daniel     2010-06-29 22:14:49
Welcome  luke       2010-08-10 20:12:20
Welcome  skywalker  2010-08-19 22…
```

How to group by in SQL by largest date (Order By a Group By)

五迷三道 · Submitted on 2019-12-13 17:40:16
Question: I have the following database table. Here is the sample data I have for it. What I am trying to figure out is how to write a query that selects all `apntoken` values for `userid = '20'`, grouping by `deviceid` and then by `apntoken` as well (except that it should show the most recent `apntoken`). One of the queries I have tried is this:

```sql
SELECT DISTINCT apntoken, deviceid, created
FROM `distribution_mobiletokens`
WHERE userid = '20'
GROUP BY deviceid
```

This returns the following result. Notice the date is not the…
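In MySQL, `GROUP BY deviceid` picks an arbitrary row's `apntoken`, not the latest one. The standard greatest-row-per-group fix is to join each device to its maximum `created` timestamp. A sketch with Python's `sqlite3`, using the question's table and column names; the sample rows are invented, since the question's data was not reproduced here:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE distribution_mobiletokens
                (userid TEXT, deviceid TEXT, apntoken TEXT, created TEXT)""")
conn.executemany("INSERT INTO distribution_mobiletokens VALUES (?,?,?,?)", [
    ("20", "dev1", "token-old", "2013-01-01"),
    ("20", "dev1", "token-new", "2013-02-01"),
    ("20", "dev2", "token-x",   "2013-01-15"),
])

# Join each row to its device's MAX(created); only the latest row per
# device survives the join condition.
rows = conn.execute("""
    SELECT t.deviceid, t.apntoken, t.created
    FROM distribution_mobiletokens t
    JOIN (SELECT deviceid, MAX(created) AS created
          FROM distribution_mobiletokens
          WHERE userid = '20'
          GROUP BY deviceid) latest
      ON t.deviceid = latest.deviceid AND t.created = latest.created
    WHERE t.userid = '20'
    ORDER BY t.deviceid
""").fetchall()
print(rows)  # [('dev1', 'token-new', '2013-02-01'), ('dev2', 'token-x', '2013-01-15')]
```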

Oracle group by only ONE column

徘徊边缘 · Submitted on 2019-12-13 16:21:51
Question: I have a table in an Oracle database which has 40 columns. I know that if I want to do a GROUP BY query, all the columns in the SELECT must appear in the GROUP BY. I simply want to do:

```sql
select col1, col2, col3, col4, col5
from table
group by col3
```

If I try:

```sql
select col1, col2, col3, col4, col5
from table
group by col1, col2, col3, col4, col5
```

it does not give the required output. I have searched for this but did not find any solution. All the queries that I found use some kind of Add() or COUNT(*)…
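The underlying issue: with one output row per `col3` value, SQL must be told *which* row's `col1`, `col2`, etc. to keep. One standard workaround (not from the original post) is a window function that picks one representative row per group. A sketch with Python's `sqlite3`; the table `t` and the "keep the row with the smallest col1" rule are assumptions for illustration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (col1, col2, col3, col4, col5)")
conn.executemany("INSERT INTO t VALUES (?,?,?,?,?)", [
    (1, "a", "x", 10, 100),
    (2, "b", "x", 20, 200),
    (3, "c", "y", 30, 300),
])

# One row per col3 value: rank rows within each group, keep rank 1.
# The ORDER BY inside OVER() decides which row represents the group.
rows = conn.execute("""
    SELECT col1, col2, col3, col4, col5
    FROM (SELECT *, ROW_NUMBER() OVER
                 (PARTITION BY col3 ORDER BY col1) AS rn
          FROM t)
    WHERE rn = 1
    ORDER BY col3
""").fetchall()
print(rows)  # [(1, 'a', 'x', 10, 100), (3, 'c', 'y', 30, 300)]
```

Oracle supports the same `ROW_NUMBER() OVER (PARTITION BY ...)` form.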

MySQL - Group by multiple rows

眉间皱痕 · Submitted on 2019-12-13 16:12:53
Question: I have an online survey for my users, and every time a user answers a survey, I capture their details in a `survey_stats` table like this:

| id | user_id | survey_id | key          | value   |
|----|---------|-----------|--------------|---------|
| 1  | 10      | 99        | gender       | male    |
| 2  | 10      | 99        | age          | 32      |
| 3  | 10      | 99        | relationship | married |
| 4  | 11      | 99        | gender       | female  |
| 5  | 11      | 99        | age          | 27      |
| 6  | 11      | 99        | relationship | single  |

In other words, when a user answers a survey, I…
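The question is cut off, but the data shape points at the classic key/value-to-columns problem, usually solved with conditional aggregation (`GROUP BY user_id` plus one `MAX(CASE ...)` per key). A sketch with Python's `sqlite3`, using the question's table and sample data:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE survey_stats
                (id INTEGER, user_id INTEGER, survey_id INTEGER,
                 key TEXT, value TEXT)""")
conn.executemany("INSERT INTO survey_stats VALUES (?,?,?,?,?)", [
    (1, 10, 99, "gender", "male"),   (2, 10, 99, "age", "32"),
    (3, 10, 99, "relationship", "married"),
    (4, 11, 99, "gender", "female"), (5, 11, 99, "age", "27"),
    (6, 11, 99, "relationship", "single"),
])

# Pivot key/value rows into one row per user: each CASE picks out one
# key's value, and MAX collapses the group to that single non-null value.
rows = conn.execute("""
    SELECT user_id,
           MAX(CASE WHEN key = 'gender'       THEN value END) AS gender,
           MAX(CASE WHEN key = 'age'          THEN value END) AS age,
           MAX(CASE WHEN key = 'relationship' THEN value END) AS relationship
    FROM survey_stats
    WHERE survey_id = 99
    GROUP BY user_id
    ORDER BY user_id
""").fetchall()
print(rows)  # [(10, 'male', '32', 'married'), (11, 'female', '27', 'single')]
```

In MySQL itself, `key` is a reserved word and would need backtick quoting.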

Find Most Common Value and Corresponding Count Using Spark Groupby Aggregates

梦想的初衷 · Submitted on 2019-12-13 15:44:53
Question: I am trying to use Spark (Scala) DataFrames to do group-by aggregates for the mode and the corresponding count. For example, suppose we have the following dataframe:

```
Category  Color   Number  Letter
1         Red     4       A
1         Yellow  Null    B
3         Green   8       C
2         Blue    Null    A
1         Green   9       A
3         Green   8       B
3         Yellow  Null    C
2         Blue    9       B
3         Blue    8       B
1         Blue    Null    Null
1         Red     7       C
2         Green   Null    C
1         Yellow  7       Null
3         Red     Null    B
```

Now we want to group by Category, then Color, and then find the size of the grouping, the count of non-null Numbers, the total…
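Spark setup aside, the grouping logic (group size, non-null count, mode of `Number` with its frequency) can be sketched in pandas on a subset of the question's data; the Spark version would use `groupBy` with the analogous aggregations. This is an illustration of the aggregation logic, not the original poster's Scala solution:

```python
import pandas as pd

# Subset of the question's dataframe (Letter omitted; not needed here)
df = pd.DataFrame({
    "Category": [1, 1, 3, 2, 1, 3, 3],
    "Color":    ["Red", "Yellow", "Green", "Blue", "Green", "Green", "Yellow"],
    "Number":   [4, None, 8, None, 9, 8, None],
})

g = df.groupby(["Category", "Color"])["Number"]

# Group size and non-null count come straight from built-in aggregations
out = g.agg(size="size", non_null="count")

# Mode (most common non-null value) and how often it occurs; all-null
# groups get None / 0.
out["mode"] = g.apply(
    lambda s: s.mode().iloc[0] if not s.mode().empty else None)
out["mode_count"] = g.apply(
    lambda s: int((s == s.mode().iloc[0]).sum()) if not s.mode().empty else 0)
print(out)
```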

Multi Join in a single SQL query

允我心安 · Submitted on 2019-12-13 15:40:23
Question: Below is the data in TestingTable1, always sorted by date in descending order:

```
BUYER_ID | ITEM_ID      | CREATED_TIME
---------+--------------+---------------------
1345653  | 110909316904 | 2012-07-09 21:29:06
1345653  | 151851771618 | 2012-07-09 19:57:33
1345653  | 221065796761 | 2012-07-09 19:31:48
1345653  | 400307563710 | 2012-07-09 18:57:33
```

And this is the data in TestingTable2, always sorted by date in descending order:

```
USER_ID | PRODUCT_ID | LAST_TIME
--------+------------+---------------------…
```

Select a random entry from a group after grouping by a value (not column)?

本秂侑毒 · Submitted on 2019-12-13 15:24:09
Question: I want to write a query using Postgres and PostGIS. I'm also using Rails with `rgeo`, `rgeo-activerecord` and `activerecord-postgis-adapter`, but the Rails side is rather unimportant. The table structure:

```
measurement
- int id
- int anchor_id
- Point groundtruth
- data (not important for the query)
```

Example data:

```
id | anchor_id | groundtruth | data
-----------------------------------
1  | 1         | POINT(1 4)  | ...
2  | 3         | POINT(1 4)  | ...
3  | 2         | POINT(1 4)  | ...
4  | 3         | POINT(1 4)  | ...
---------------…
```
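In Postgres itself, the usual tool for "one random row per group-by-value" is `SELECT DISTINCT ON (groundtruth) * FROM measurement ORDER BY groundtruth, random()`. `DISTINCT ON` is Postgres-specific, so a portable sketch of the same idea (rank rows randomly within each `groundtruth` value, keep rank 1) can be shown with Python's `sqlite3`; the sample rows extend the question's data with an invented second point:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE measurement
                (id INTEGER, anchor_id INTEGER, groundtruth TEXT)""")
conn.executemany("INSERT INTO measurement VALUES (?,?,?)", [
    (1, 1, "POINT(1 4)"), (2, 3, "POINT(1 4)"),
    (3, 2, "POINT(1 4)"), (4, 3, "POINT(1 4)"),
    (5, 1, "POINT(2 2)"), (6, 2, "POINT(2 2)"),
])

# Rank rows randomly within each groundtruth value, keep one per group
rows = conn.execute("""
    SELECT id, anchor_id, groundtruth
    FROM (SELECT *, ROW_NUMBER() OVER
                 (PARTITION BY groundtruth ORDER BY RANDOM()) AS rn
          FROM measurement)
    WHERE rn = 1
""").fetchall()
print(rows)  # one randomly chosen row per groundtruth value
```

Note the grouping key is the *value* of `groundtruth`, not a column per row, which is exactly what `PARTITION BY groundtruth` (or `DISTINCT ON`) gives.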