group-by

Gaps and Islands solution in Oracle - use of recursive

六眼飞鱼酱① submitted on 2019-12-30 07:52:33
Question: I have a problem that could easily be solved using a cursor in Oracle. However, I wonder whether it could be done using SELECT only. I have one data set that contains the following fields: Start, Description, MaximumRow, SequentialOrder. The data set is ordered by Description, Start, SequentialOrder. This is the data for illustration purposes: I would like to get the following results in a different data set (Start, End, Description), where Start is the minimum of the "Start" field in a set and End is

Hibernate criteria query using Max() projection on key field and group by foreign primary key

∥☆過路亽.° submitted on 2019-12-30 06:32:08
Question: I'm having difficulty representing this query (which works on the database directly) as a criteria query in Hibernate (version 3.2.5): SELECT s.* FROM ftp_status s WHERE (s.datetime,s.connectionid) IN (SELECT MAX(f.datetime), f.connectionid FROM ftp_status f GROUP BY f.connectionid); So far, this is what I've come up with. It doesn't work, and throws a "could not resolve property: datetime of: common.entity.FtpStatus" error message: Criteria crit = s.createCriteria(FtpStatus.class); crit = crit

Pandas groupby with categories with redundant nan

一笑奈何 submitted on 2019-12-29 20:17:05
Question: I am having issues using pandas groupby with categorical data. Theoretically, it should be super efficient: you are grouping and indexing via integers rather than strings. But it insists that, when grouping by multiple categories, every combination of categories must be accounted for. I sometimes use categories even when there's a low density of common strings, simply because those strings are long and it saves memory / improves performance. Sometimes there are thousands of categories in each
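
The behaviour described in this question can be reproduced with a small sketch. The frame, column names, and category values below are hypothetical, and the `observed` argument of `groupby` is assumed to be available (it exists in pandas 0.23 and later). This illustrates the effect only; it is not taken from the truncated question or its answers.

```python
import pandas as pd

# Hypothetical data: two categorical columns whose declared categories
# include values that never occur together (or at all) in the rows.
df = pd.DataFrame({
    "a": pd.Categorical(["x", "x", "y"], categories=["x", "y", "z"]),
    "b": pd.Categorical(["p", "q", "p"], categories=["p", "q", "r"]),
    "v": [1, 2, 3],
})

# observed=False: one result row per combination of categories (3 x 3).
full = df.groupby(["a", "b"], observed=False)["v"].sum()

# observed=True: only the combinations that actually appear in the data.
seen = df.groupby(["a", "b"], observed=True)["v"].sum()

print(len(full), len(seen))
```

With thousands of categories per column, the difference between the two result sizes is what drives the memory and run-time cost the question complains about.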

Select only last value using group by at mysql

删除回忆录丶 submitted on 2019-12-29 08:27:08
Question: I have a table that records attendance at events. A new row is stored every time a user submits their attendance, so the data looks like this:
mysql> SELECT id_branch_channel, id_member, attendance, timestamp, id_member FROM view_event_attendance WHERE id_event = 782;
+-------------------+-----------+------------+------------+-----------+
| id_branch_channel | id_member | attendance | timestamp  | id_member |
+-------------------+-----------+------------+--

Use dplyr's group_by to perform split-apply-combine

自闭症网瘾萝莉.ら submitted on 2019-12-29 07:54:08
Question: I am trying to use dplyr to do the following: tapply(iris$Petal.Length, iris$Species, shapiro.test) I want to split the Petal.Length values by Species and apply a function to each group, in this case shapiro.test. I read this SO question and quite a number of other pages. I am more or less able to split the variable into groups using do: iris %>% group_by(Species) %>% select(Petal.Length) %>% do(print(.$Petal.Length)) [1] 1.4 1.4 1.3 1.5 1.4 1.7 1.4 1.5 1.4 1.5 1.5 1.6 1.4 1.1 1.2 [16] 1.5 1.3 1.4 1.7 1.5 1.7 1.5

Linq GroupBy - how to specify the grouping key at runtime?

对着背影说爱祢 submitted on 2019-12-29 07:13:08
Question: Is there a good way to do a Linq GroupBy where the grouping key is determined at runtime? For example, I want the grouping key to be built from a user-selected list of fields. Can you do this? I know I can do it easily if I convert everything to a table of strings, but I was wondering if there was an elegant or clever way to accomplish this otherwise. class Item { public int A, B; public DateTime D; public double X, Y, Z; } I have a List<Item> called data. I want to do things like retrieve the sum

MYSQL delete all results having count(*)=1

回眸只為那壹抹淺笑 submitted on 2019-12-29 04:24:08
Question: I have a table taged with two fields, sesskey (varchar32, index) and products (int11). I need to delete all rows whose sesskey group has count(*) = 1. I have tried a few methods, but all of them fail. Example: delete from taged where sesskey in (select sesskey from taged group by sesskey having count(*) = 1) The sesskey field cannot be a primary key because it is repeated. Answer 1: DELETE si FROM t_session si JOIN ( SELECT sesskey FROM t_session so GROUP BY sesskey HAVING COUNT(*) = 1 ) q ON q

SQL count if columns

丶灬走出姿态 submitted on 2019-12-28 14:29:48
Question: What is the best way to create columns which count the number of occurrences of data in a table? The table needs to be grouped by one column. I have seen SELECT sum(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS ZERO, sum(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS ONE, sum(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS TWO, category FROM reviews GROUP BY category where question1 can have a value of either 0, 1 or 2. I have also seen a version of that using count(CASE WHEN question1 = 0 THEN
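
The two conditional-count styles mentioned in this question can be compared in a small sketch. It runs against an in-memory SQLite database through Python's `sqlite3` module rather than the asker's database, and the sample rows are made up. The `COUNT(CASE ... END)` form shown is an assumption about what the truncated variant looks like: it relies on `COUNT` ignoring the NULL produced when no `ELSE` branch is given.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reviews (category TEXT, question1 INTEGER)")
# Hypothetical sample rows, not from the original question.
conn.executemany(
    "INSERT INTO reviews VALUES (?, ?)",
    [("a", 0), ("a", 1), ("a", 1), ("b", 2), ("b", 0)],
)

rows = conn.execute(
    """
    SELECT category,
           SUM(CASE WHEN question1 = 0 THEN 1 ELSE 0 END) AS zero,
           SUM(CASE WHEN question1 = 1 THEN 1 ELSE 0 END) AS one,
           SUM(CASE WHEN question1 = 2 THEN 1 ELSE 0 END) AS two,
           -- Assumed COUNT variant: non-matching rows yield NULL and are skipped.
           COUNT(CASE WHEN question1 = 0 THEN 1 END) AS zero_via_count
    FROM reviews
    GROUP BY category
    ORDER BY category
    """
).fetchall()

for row in rows:
    print(row)
```

On this sample the `zero` and `zero_via_count` columns agree for every category, which is the point of the comparison.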

Pandas groupby and qcut

我的未来我决定 submitted on 2019-12-28 11:45:08
Question: Is there a way to structure Pandas groupby and qcut commands to return one column that has nested tiles? Specifically, suppose I have 2 groups of data and I want qcut applied to each group, with the output returned in one column. This would be similar to MS SQL Server's ntile() command that allows Partition by().
     A    B  C
0  foo  0.1  1
1  foo  0.5  2
2  foo  1.0  3
3  bar  0.1  1
4  bar  0.5  2
5  bar  1.0  3
In the dataframe above I would like to apply the qcut function to B while partitioning on A to return C
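
One way to get per-group tiles is to call `pd.qcut` inside `groupby(...).transform`. The sketch below rebuilds columns A and B from the question and derives C. The choice of three quantiles, `labels=False`, and the `+ 1` offset (to make the tiles 1-based like `ntile()`) are assumptions made for illustration, not something the truncated question states.

```python
import pandas as pd

# Columns A and B as shown in the question; C is derived below.
df = pd.DataFrame({
    "A": ["foo", "foo", "foo", "bar", "bar", "bar"],
    "B": [0.1, 0.5, 1.0, 0.1, 0.5, 1.0],
})

# Apply qcut separately within each value of A. labels=False returns
# 0-based bin numbers, so 1 is added to get 1-based tiles.
df["C"] = (
    df.groupby("A")["B"]
      .transform(lambda s: pd.qcut(s, 3, labels=False) + 1)
      .astype(int)
)

print(df)
```

Because `transform` returns a result aligned with the original index, the tiles land in a single column next to the rows they describe.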

How to apply “first” and “last” functions to columns while using group by in pandas?

社会主义新天地 submitted on 2019-12-28 11:44:12
Question: I have a data frame and I would like to group it by a particular column (or, in other words, by the values in a particular column). I can do it in the following way: grouped = df.groupby(['ColumnName']). I imagine the result of this operation as a table in which some cells can contain sets of values instead of single values. To get a usual table (i.e. a table in which every cell contains only a single value) I need to indicate what function I want to use to transform the sets of values in
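
A minimal sketch of choosing a different reduction per column, assuming a hypothetical frame with a grouping column `key` and two value columns `x` and `y`. Passing a dict to `agg` maps each column to the function that collapses a group's values into one cell.

```python
import pandas as pd

# Hypothetical data; the column names are placeholders.
df = pd.DataFrame({
    "key": ["a", "a", "b", "b"],
    "x": [1, 2, 3, 4],
    "y": [10, 20, 30, 40],
})

# Take the first value of x and the last value of y within each group.
result = df.groupby("key").agg({"x": "first", "y": "last"})

print(result)
```

By default the groupby `first` and `last` reductions skip missing values, so they return the first and last non-null entry in each group rather than strictly the first and last row.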