group-by | 易学教程

Gaps and Islands solution in Oracle - use of recursive

Question: I have a problem that could easily be solved using a cursor in Oracle. However, I wonder whether it could be done using SELECT only. I have one data set that contains the following fields: Start, Description, MaximumRow, SequentialOrder. The data set is ordered by Description, Start, SequentialOrder. This is the data for illustration purposes: I would like to get the following results in a different data set (Start, End, Description) where Start is the minimum of the "Start" field in a set and End is

Hibernate criteria query using Max() projection on key field and group by foreign primary key

Question: I'm having difficulty representing this query (which works on the database directly) as a criteria query in Hibernate (version 3.2.5): SELECT s.* FROM ftp_status s WHERE (s.datetime,s.connectionid) IN (SELECT MAX(f.datetime), f.connectionid FROM ftp_status f GROUP BY f.connectionid); So far this is what I've come up with. It doesn't work, and throws a "could not resolve property: datetime of: common.entity.FtpStatus" error message: Criteria crit = s.createCriteria(FtpStatus.class); crit = crit

Pandas groupby with categories with redundant nan

Question: I am having issues using pandas groupby with categorical data. Theoretically, it should be super efficient: you are grouping and indexing via integers rather than strings. But it insists that, when grouping by multiple categories, every combination of categories must be accounted for. I sometimes use categories even when there's a low density of common strings, simply because those strings are long and it saves memory / improves performance. Sometimes there are thousands of categories in each
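The behaviour described in the question above can be reproduced with a small sketch. The column names and data here are hypothetical, and the example assumes a pandas version whose `groupby` accepts the `observed` parameter. It is an illustration of the symptom and one commonly used switch, not the answer from the original thread.

```python
import pandas as pd

# Hypothetical data: two categorical columns where only a few
# of the possible category combinations actually occur.
df = pd.DataFrame({
    "a": pd.Categorical(["x", "y", "x"], categories=["x", "y", "z"]),
    "b": pd.Categorical(["p", "q", "p"], categories=["p", "q", "r"]),
    "value": [1, 2, 3],
})

# observed=False: one result row per combination of categories (3 x 3).
all_combos = df.groupby(["a", "b"], observed=False)["value"].sum()

# observed=True: only the combinations that appear in the data.
seen_only = df.groupby(["a", "b"], observed=True)["value"].sum()

print(len(all_combos), len(seen_only))
```

With thousands of categories per column, the difference between the two result sizes is what drives the memory and speed problem the question describes.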

Select only last value using group by at mysql

Question: I have one table with data about attendance at some events. The table stores a new attendance row every time the user sends new attendance; the information looks like this:

mysql> SELECT id_branch_channel, id_member, attendance, timestamp, id_member FROM view_event_attendance WHERE id_event = 782;
+-------------------+-----------+------------+------------+-----------+
| id_branch_channel | id_member | attendance | timestamp | id_member |
+-------------------+-----------+------------+--
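One common way to keep only the latest row per member is to join the table against a per-member `MAX(timestamp)` subquery. The sketch below runs against an in-memory SQLite database rather than MySQL, models `view_event_attendance` as a plain table with a reduced set of columns, and uses invented rows; all of that is assumed for illustration and is not taken from the original thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumption: simplified stand-in for view_event_attendance.
conn.execute(
    "CREATE TABLE view_event_attendance ("
    "id_event INTEGER, id_member INTEGER, attendance INTEGER, timestamp INTEGER)"
)
# Hypothetical rows: each member reports attendance twice for event 782.
rows = [(782, 1, 0, 100), (782, 1, 1, 200), (782, 2, 1, 150), (782, 2, 0, 300)]
conn.executemany("INSERT INTO view_event_attendance VALUES (?, ?, ?, ?)", rows)

# Join each row to the newest timestamp of its member, keeping only matches.
latest = conn.execute("""
    SELECT a.id_member, a.attendance, a.timestamp
    FROM view_event_attendance AS a
    JOIN (
        SELECT id_member, MAX(timestamp) AS last_ts
        FROM view_event_attendance
        WHERE id_event = 782
        GROUP BY id_member
    ) AS m ON m.id_member = a.id_member AND m.last_ts = a.timestamp
    WHERE a.id_event = 782
    ORDER BY a.id_member
""").fetchall()
print(latest)
```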

Use dplyr's group_by to perform split-apply-combine

Question: I am trying to use dplyr to do the following: tapply(iris$Petal.Length, iris$Species, shapiro.test) I want to split the Petal.Lengths by Species, and apply a function, in this case shapiro.test. I read this SO question and quite a number of other pages. I am sort of able to split the variable into groups, using do : iris %>% group_by(Species) %>% select(Petal.Length) %>% do(print(.$Petal.Length)) [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 [16] 1.5 1.3 1.4 1.7 1.5 1.7 1.5

Linq GroupBy - how to specify the grouping key at runtime?

Question: Is there a good way to do a Linq GroupBy where the grouping key is determined at runtime? For example, I want the grouping key to be built from a user-selected list of fields. Can you do this? I know I can do it easily if I convert everything to a table of strings, but I was wondering if there was an elegant or clever way to accomplish this otherwise. class Item { public int A, B; public DateTime D; public double X, Y, Z; } I have a List<Item> called data . I want to do things like retrieve the sum

MYSQL delete all results having count(*)=1

Question: I have a table taged with two fields, sesskey (varchar(32), indexed) and products (int(11)). I have to delete all rows whose sesskey group has count(*) = 1. I have tried a few methods, but all fail. Example: delete from taged where sesskey in (select sesskey from taged group by sesskey having count(*) = 1) The sesskey field cannot be a primary key because it is repeated. Answer 1: DELETE si FROM t_session si JOIN ( SELECT sesskey FROM t_session so GROUP BY sesskey HAVING COUNT(*) = 1 ) q ON q
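The intended effect of the delete can be checked with a small sketch. It runs in an in-memory SQLite database, which accepts the subquery form quoted in the question; MySQL is stricter about selecting from the table being deleted from, which is presumably why the quoted answer uses a join instead. The sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE taged (sesskey TEXT, products INTEGER)")
# Hypothetical rows: 'b' occurs once, 'a' and 'c' occur twice.
conn.executemany(
    "INSERT INTO taged VALUES (?, ?)",
    [("a", 1), ("a", 2), ("b", 3), ("c", 4), ("c", 5)],
)

# Delete every row whose sesskey appears exactly once in the table.
conn.execute("""
    DELETE FROM taged
    WHERE sesskey IN (
        SELECT sesskey FROM taged GROUP BY sesskey HAVING COUNT(*) = 1
    )
""")

remaining = conn.execute(
    "SELECT sesskey, products FROM taged ORDER BY sesskey, products"
).fetchall()
print(remaining)
```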

SQL count if columns

Question: What is the best way to create columns which count the number of occurrences of data in a table? The table needs to be grouped by one column. I have seen SELECT sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO, sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE, sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO, category FROM reviews GROUP BY category where question1 can have a value of either 0, 1 or 2. I have also seen a version of that using count(CASE WHEN question1 = 0 THEN
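The `sum(CASE WHEN ...)` query quoted above can be tried out as follows. This is a minimal sketch against an in-memory SQLite database with invented rows; only the `reviews` table name and the `category` and `question1` columns come from the question.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (category TEXT, question1 INTEGER)")
# Hypothetical answers, each 0, 1 or 2, spread over two categories.
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [("a", 0), ("a", 1), ("a", 1), ("b", 2), ("b", 0)],
)

# Each SUM adds 1 for rows matching its condition, giving a count per value.
counts = conn.execute("""
    SELECT sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO,
           sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE,
           sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO,
           category
    FROM reviews
    GROUP BY category
    ORDER BY category
""").fetchall()
print(counts)
```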

Pandas groupby and qcut

Question: Is there a way to structure pandas groupby and qcut commands to return one column that has nested tiles? Specifically, suppose I have two groups of data and I want qcut applied to each group, with the output returned in one column. This would be similar to MS SQL Server's ntile() command that allows Partition by().

     A    B  C
0  foo  0.1  1
1  foo  0.5  2
2  foo  1.0  3
3  bar  0.1  1
4  bar  0.5  2
5  bar  1.0  3

In the dataframe above I would like to apply the qcut function to B while partitioning on A to return C
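One way to get the per-group tiles asked for above is to apply `pd.qcut` inside `groupby(...).transform`. The sketch below rebuilds columns A and B from the question and derives C; choosing three bins and shifting the zero-based bin codes by one are assumptions made to match the example table, not details confirmed by the original thread.

```python
import pandas as pd

# Columns A and B as shown in the question.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "bar", "bar", "bar"],
    "B": [0.1, 0.5, 1.0, 0.1, 0.5, 1.0],
})

# qcut with labels=False returns zero-based bin codes; add 1 for 1-based tiles.
# transform keeps the result aligned with the original rows.
df["C"] = df.groupby("A")["B"].transform(
    lambda s: pd.qcut(s, 3, labels=False) + 1
)

print(df)
```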

How to apply “first” and “last” functions to columns while using group by in pandas?

Question: I have a data frame and I would like to group it by a particular column (or, in other words, by values from a particular column). I can do it in the following way: grouped = df.groupby(['ColumnName']) . I imagine the result of this operation as a table in which some cells can contain sets of values instead of single values. To get a usual table (i.e. a table in which every cell contains only a single value) I need to indicate what function I want to use to transform the sets of values in
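Choosing a reducing function per column, as the question describes, can be sketched with `agg` and a dictionary that maps column names to aggregation names. Only `ColumnName` comes from the question; the `x` and `y` columns and their values are hypothetical.

```python
import pandas as pd

# Hypothetical frame: two groups with two rows each.
df = pd.DataFrame({
    "ColumnName": ["a", "a", "b", "b"],
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
})

# Take the first x and the last y within each group.
result = df.groupby("ColumnName").agg({"x": "first", "y": "last"})

print(result)
```

Each cell of `result` then holds a single value, which is the "usual table" the question is after.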