group-by

Aggregating an hourly time series by day via pd.TimeGrouper('D'); issue at timestamp 00:00:00 (hour 24)

不问归期, submitted on 2020-01-15 10:05:42
Question: df:

                     hour   rev
datetime
2016-05-01 01:00:00     1 -0.02
2016-05-01 02:00:00     2 -0.01
2016-05-01 03:00:00     3 -0.02
2016-05-01 04:00:00     4 -0.02
2016-05-01 05:00:00     5 -0.01
2016-05-01 06:00:00     6 -0.03
2016-05-01 07:00:00     7 -0.10
2016-05-01 08:00:00     8 -0.09
2016-05-01 09:00:00     9 -0.08
2016-05-01 10:00:00    10 -0.10
2016-05-01 11:00:00    11 -0.12
2016-05-01 12:00:00    12 -0.14
2016-05-01 13:00:00    13 -0.17
2016-05-01 14:00:00    14 -0.16
2016-05-01 15:00:00    15 -0.15
2016-05-01 16:00:00    16 -0.15
2016-05-01 17:00:00
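The gist of the issue is that a row stamped 00:00:00 ("hour 24") falls into the next day's bin. A minimal sketch with invented data shaped like the question's; note pd.TimeGrouper was deprecated and later removed, so pd.Grouper(freq="D") is used here instead:

```python
import pandas as pd

# Invented hourly data: 01:00 through the following midnight (hour 24).
idx = pd.date_range("2016-05-01 01:00", periods=24, freq="h")
df = pd.DataFrame({"rev": range(24)}, index=idx)

# Naive daily grouping puts the 00:00:00 row into the NEXT day's bin.
naive = df.groupby(pd.Grouper(freq="D")).sum()       # 2 rows

# Shifting the index back one hour keeps hour 24 with the day it ends.
shifted = df.set_index(df.index - pd.Timedelta(hours=1))
daily = shifted.groupby(pd.Grouper(freq="D")).sum()  # 1 row
```

The one-hour shift is one workaround for the hour-24 convention; relabeling the bins after aggregation is another.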

Add column from one data frame to group-by data frame in python

不问归期, submitted on 2020-01-15 09:28:16
Question: I have two data frames in python. The first is raw rainfall data for a single day of the year, and the second is the sum of daily rainfall using groupby. One data frame looks like this (with many more rows in between device_ids):

>>> df1
                               device_id  rain  day  month  year
0   9z849362-b05d-4317-96f5-f267c1adf8d6   0.0   31     12  2016
1   9z849362-b05d-4317-96f5-f267c1adf8d6   0.0   31     12  2016
6   e7z581f0-2693-42ad-9896-0048550ccda7   0.0   31     12  2016
11  e7z581f0-2693-42ad-9896-0048550ccda7   0.0   31     12  2016
12  ceez972b
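A hedged sketch of attaching the grouped sum back onto the raw frame: groupby().transform("sum") returns a result aligned row-for-row with the original, so no merge is needed (device names and values below are invented):

```python
import pandas as pd

# Invented rows mirroring the question's columns.
df1 = pd.DataFrame({
    "device_id": ["dev-a", "dev-a", "dev-b", "dev-b"],
    "rain":      [0.0, 1.5, 0.2, 0.3],
    "day": [31] * 4, "month": [12] * 4, "year": [2016] * 4,
})

# Daily total per device, broadcast back to every original row.
df1["daily_rain"] = (
    df1.groupby(["device_id", "day", "month", "year"])["rain"].transform("sum")
)
```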

last occurrence of particular value with if else statement by group

六月ゝ 毕业季﹏, submitted on 2020-01-15 09:17:28
Question: Let's say I have the following data:

library(tidyverse)
cf <- data.frame(
  x = c("a", "a", "a", "a", "b", "b", "b", "b", "c", "c", "c", "c", "d", "d", "e", "e"),
  y = c("free", "with", "sus", "sus", "free", "with", "free", "sus",
        "sus", "with", "free", "sus", "free", "free", "with", "sus"))

> cf
   x    y
1  a free
2  a with
3  a  sus
4  a  sus
5  b free
6  b with
7  b free
8  b  sus
9  c  sus
10 c with
11 c free
12 c  sus
13 d free
14 d free
15 e with
16 e  sus

I want to obtain an indicator variable equal to 1 if
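The question is posed in R/tidyverse, but since the rest of this page is pandas, here is a pandas sketch of the general pattern. The exact condition is cut off above, so "flag the last 'sus' row within each x group" is an assumption:

```python
import pandas as pd

# Same data as the R example above.
cf = pd.DataFrame({
    "x": list("aaaabbbbccccddee"),
    "y": ["free", "with", "sus", "sus", "free", "with", "free", "sus",
          "sus", "with", "free", "sus", "free", "free", "with", "sus"],
})

# Keep only 'sus' rows, take the last one per x group, flag those indices.
last_sus = cf[cf["y"] == "sus"].groupby("x").tail(1).index
cf["indicator"] = cf.index.isin(last_sus).astype(int)
```

In dplyr the analogous shape would be a group_by(x) followed by a mutate() that marks the last matching row per group.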

Groupby time interval and find unique IDs with similar min values (entry time values)

谁都会走, submitted on 2020-01-15 05:45:14
Question: Datanovice helped me in this post, Determining group size based on entry and exit times of IDs in my df, so I got further with my problem. But how can I now group the dataset (see subset below) into datetime seconds, look at the 'min' values of the IDs, and count the unique IDs per grouped 'date' second that have common 'min' values, with a flexibility of a minute, for example? Is there any smart way to do this?

t_code  date                 x    y    id    min  max
4632    2019-09-17 10:17:10  209  201  5170  2019-09-17 09
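Only part of the question survives above, so this sketches the general move it asks about: bucket the 'min' (entry time) column into one-minute windows with pd.Grouper and count the unique ids landing in each window (data invented):

```python
import pandas as pd

# Invented ids and entry times spanning two one-minute windows.
df = pd.DataFrame({
    "id":  [5170, 5171, 5172, 5173],
    "min": pd.to_datetime(["2019-09-17 09:00:10", "2019-09-17 09:00:40",
                           "2019-09-17 09:02:05", "2019-09-17 09:02:30"]),
})

# Unique ids whose entry time falls in the same one-minute bucket.
group_sizes = df.groupby(pd.Grouper(key="min", freq="1min"))["id"].nunique()
```

Fixed buckets only approximate "within a minute of each other"; entry times straddling a bucket edge end up in different groups.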

segmenting or grouping a df based on parameters or differences within columns going down the dataframe rows?

僤鯓⒐⒋嵵緔, submitted on 2020-01-15 05:27:06
Question: I was trying to figure out whether there is a way to segment or group a dataframe with multiple fields into a new dataframe, based on whether the values of specific columns are within x amount of each other?

I.D  | Created_Time            | Home_Longitude | Home_Latitude | Work_Longitude | Home_Latitude
Faa1 | 2019-02-23 20:01:13.362 | -77.0364       | 38.8951       | -72.0364       | 38.8951

Above is how the original df looks with multiple rows. I want to create a new dataframe where all rows
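One hedged way to group rows whose coordinates lie within roughly x of each other: snap each coordinate to a grid of size tol and group on the snapped keys. This is an approximation (points straddling a grid boundary can split into different groups), and the values below are invented:

```python
import pandas as pd

tol = 0.01  # hypothetical tolerance: "within 0.01 of each other"
df = pd.DataFrame({
    "I.D": ["Faa1", "Fbb2", "Fcc3"],
    "Home_Longitude": [-77.0364, -77.0361, -75.1000],
    "Home_Latitude":  [38.8951, 38.8953, 39.9500],
})

# Snap each coordinate to the tol grid and label each grid cell.
cols = ["Home_Longitude", "Home_Latitude"]
keys = (df[cols] / tol).round()
df["group"] = keys.groupby(cols).ngroup()
```

For a true "within x of any member" grouping, a clustering pass (e.g. distance-based clustering on the coordinates) would be the more robust choice.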


Python: how to group similar lists together in a list of lists?

谁说胖子不能爱, submitted on 2020-01-15 05:10:07
Question: I have a list of lists in python. I want to group similar lists together. That is, if the first three elements of each list are the same, then those lists should go in one group. For example:

[["a", "b", "c", 1, 2], ["d", "f", "g", 8, 9], ["a", "b", "c", 3, 4], ["d", "f", "g", 3, 4], ["a", "b", "c", 5, 6]]

I want this to look like:

[[["a", "b", "c", 1, 2], ["a", "b", "c", 5, 6], ["a", "b", "c", 3, 4]],
 [["d", "f", "g", 3, 4], ["d", "f", "g", 8, 9]]]

I could do this by running an iterator and manually
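A compact way to do this, sketched with the question's own data: sort by the three-element prefix, then let itertools.groupby collect the runs that share it (order within each group follows the sort, not the exact ordering shown above):

```python
from itertools import groupby
from operator import itemgetter

data = [["a", "b", "c", 1, 2], ["d", "f", "g", 8, 9], ["a", "b", "c", 3, 4],
        ["d", "f", "g", 3, 4], ["a", "b", "c", 5, 6]]

key = itemgetter(0, 1, 2)  # the three-element prefix used as the group key
# groupby only merges *adjacent* equal keys, hence the sort first.
grouped = [list(g) for _, g in groupby(sorted(data, key=key), key=key)]
```

A dict keyed on tuple(row[:3]) (e.g. collections.defaultdict(list)) achieves the same in one pass and preserves first-seen group order.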

Get count of row and sum group by quarter of date, if another column doesnt exits a value in SQL Server

心已入冬, submitted on 2020-01-15 05:01:11
Question: I have some sample data:

Date         Status   OfferNum   Amount
---------------------------------------
2016/10/30   1        2000       1000,00
2016/08/25   0        2000       1100,00
2016/07/12   0        2001       1200,00
2016/08/30   0        2001       1300,00
2016/07/12   1        2002       1400,00
2016/08/30   1        2002       1500,00
2016/08/30   1        2003       1600,00

I don't want to count an OfferNum if one of its rows has Status = 1 in the same quarter (if it has Status = 1 but not in the same quarter, it still has to be counted). But I want to sum all

NHibernate GroupBy and Sum

人盡茶涼, submitted on 2020-01-14 14:48:07
Question: I am starting to study NHibernate, and I have a problem I'm not able to solve; I wonder if someone could help me. The mapping is working "correctly", but when I try to do the grouping and the sum, the application returns the following error: "could not resolve property: Course.Price of: Persistence.POCO.RequestDetail"

var criteria = session.CreateCriteria(typeof(RequestDetail))
    .SetProjection(
        Projections.ProjectionList()
            .Add(Projections.RowCount(), "RowCount")
            .Add(Projections.Sum("Course

Pandas Groupby Unique Multiple Columns

落花浮王杯, submitted on 2020-01-14 14:15:06
Question: I have a dataframe.

import pandas as pd
df = pd.DataFrame(
    {'number': [0,0,0,1,1,2,2,2,2],
     'id1': [100,100,100,300,400,700,700,800,700],
     'id2': [100,100,200,500,600,700,800,900,1000]})

   id1   id2  number
0  100   100       0
1  100   100       0
2  100   200       0
3  300   500       1
4  400   600       1
5  700   700       2
6  700   800       2
7  800   900       2
8  700  1000       2

(This represents a much larger dataframe I am working with, ~millions of rows.) I can apply a groupby().unique() to one column:

df.groupby(['number'])['id1'].unique()
number
0    [100]
1    [300,
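The single-column call above extends to several columns by passing a dict to .agg, one pass over the grouped data:

```python
import pandas as pd

# Same frame as in the question.
df = pd.DataFrame({'number': [0, 0, 0, 1, 1, 2, 2, 2, 2],
                   'id1': [100, 100, 100, 300, 400, 700, 700, 800, 700],
                   'id2': [100, 100, 200, 500, 600, 700, 800, 900, 1000]})

# 'unique' per column, keyed by group; each cell holds an array of uniques.
out = df.groupby('number').agg({'id1': 'unique', 'id2': 'unique'})
```

Each resulting cell is an array in order of first appearance, matching what .unique() returns for a single column.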