aggregate

Select row prior to first occurrence of an event by group

依然范特西╮ posted on 2019-12-11 00:01:35
Question: I have a series of observations that describe if and when an animal is spotted in a specific area. The following sample table identifies when a certain animal is seen (status == 1) or not (status == 0) by day.

       id       date status
    1   1 2014-06-20      1
    2   1 2014-06-21      1
    3   1 2014-06-22      1
    4   1 2014-06-23      1
    5   1 2014-06-24      0
    6   2 2014-06-20      1
    7   2 2014-06-21      1
    8   2 2014-06-22      0
    9   2 2014-06-23      1
    10  2 2014-06-24      1
    11  3 2014-06-20      1
    12  3 2014-06-21      1
    13  3 2014-06-22      0
    14  3 2014-06-23      1
    15  3 2014-06-24      0
    16  4
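
A minimal base R sketch of one way to approach this, assuming the "event" in the title means the first row with status == 0 within each id (the excerpt does not say so explicitly). It uses only the three complete groups shown above, since group 4 is cut off; the names df, prior_rows and result are illustrative.

```r
# First three groups from the question's sample table (group 4 is cut off in the source).
df <- data.frame(
  id     = rep(1:3, each = 5),
  date   = rep(as.Date("2014-06-20") + 0:4, times = 3),
  status = c(1, 1, 1, 1, 0,
             1, 1, 0, 1, 1,
             1, 1, 0, 1, 0)
)

# Assumption: the "event" is the first row with status == 0 in each id.
prior_rows <- lapply(split(df, df$id), function(g) {
  first_event <- which(g$status == 0)[1]   # NA when the group has no event
  # No event, or the event is the very first row: nothing precedes it.
  if (is.na(first_event) || first_event == 1) return(g[0, ])
  g[first_event - 1, ]
})

# One row per id: the observation immediately before the first event.
result <- do.call(rbind, prior_rows)
print(result)
```

Groups with no event, or whose first row is already the event, contribute no row here; whether that is the desired behaviour depends on the part of the question that is not shown.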

add rows in a data.table but not when certain columns take same values

早过忘川 posted on 2019-12-10 23:58:03
Question: I have a data.table dat with 4 columns, say (col1, col2, col3, col4). Input data:

    structure(list(col1 = c(5.1, 5.1, 4.7, 4.6, 5, 5.1, 5.1, 4.7, 4.6, 5), col2 = c(3.5, 3.5, 3.2, 3.1, 3.6, 3.5, 3.5, 3.2, 3.1, 3.6), col3 = c(1.4, 1.4, 1.3, 1.5, 1.4, 3.4, 3.4, 1.3, 1.5, 1.4 ), col4 = structure(c(1L, 1L, 1L, 1L, 1L, 4L, 4L, 4L, 4L, 4L), .Label = c("setosa", "versicolor", "virginica", "eer"), class = "factor")), .Names = c("col1", "col2", "col3", "col4"), row.names = c(NA, -10L), class = c(

Aggregate a data frame in R by equally spaced time intervals

|▌冷眼眸甩不掉的悲伤 posted on 2019-12-10 23:17:46
Question: I want to aggregate the data by time and create equally spaced time intervals:

    date <- c(as.POSIXct("2011-08-08 21:00:00"), as.POSIXct("2011-08-08 21:26:00"))
    value <- c(1, 2)
    dt <- data.frame(date, value)
    DT <- aggregate(cbind(dt$value), list(cut(dt$date, breaks = "10 min")), sum)

dt:

    2011-08-08 21:00:00  1
    2011-08-08 21:26:00  2

DT:

    2011-08-08 21:00:00  1
    2011-08-08 21:20:00  2

What I want:

    2011-08-08 21:00:00  1
    2011-08-08 21:10:00 NA
    2011-08-08 21:20:00  2

Is there any way to do this without using zoo or
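
One possible base R sketch for the output described above, offered as an illustration and not as the accepted answer. It relies on cut() creating a factor level for every 10-minute bin, including empty ones, and on tapply() returning NA for levels with no observations. The tz = "UTC" argument and the name sums are assumptions added for reproducibility.

```r
# Sample data from the question; tz = "UTC" is an assumption so results do not
# depend on the local time zone.
date  <- as.POSIXct(c("2011-08-08 21:00:00", "2011-08-08 21:26:00"), tz = "UTC")
value <- c(1, 2)
dt    <- data.frame(date, value)

# cut() creates one factor level per 10-minute bin, including the empty 21:10 bin.
bins <- cut(dt$date, breaks = "10 min")

# tapply() returns one entry per level; levels without observations come back as NA.
sums <- tapply(dt$value, bins, sum)

# Rebuild a data frame, converting the bin labels back to date-times.
DT <- data.frame(date  = as.POSIXct(names(sums), tz = "UTC"),
                 value = as.vector(sums))
print(DT)
```

aggregate() drops the empty bin because it only reports groups that contain rows, whereas tapply() reports every level of the grouping factor.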

Error when calculating a running total (cumulative over the previous periods)

一曲冷凌霜 posted on 2019-12-10 23:16:10
Question: I have a table, let's call it My_Table, that has a Created datetime column (in SQL Server), and I'm trying to pull a report that shows historically how many rows were added to My_Table by month over a particular time. Now I know that I can show how many were added each month with:

    SELECT YEAR(MT.Created), MONTH(MT.Created), COUNT(*) AS [Total Added]
    FROM My_Table MT
    GROUP BY YEAR(MT.Created), MONTH(MT.Created)
    ORDER BY YEAR(MT.Created), MONTH(MT.Created)

Which would return something like: YEAR MONTH

R: “Binning” categorical variables

做~自己de王妃 posted on 2019-12-10 21:46:26
Question: I have a data.frame which has 13 columns with factors. One of the columns contains credit rating data and has 54 different values:

    levels(TR_factor$crclscod)
     [1] "A"  "A2" "AA" "B"  "B2" "BA" "C"  "C2" "C5" "CA" "CC" "CY" "D"
    [14] "D2" "D4" "D5" "DA" "E"  "E2" "E4" "EA" "EC" "EF" "EM" "G"  "GA"
    [27] "GY" "H"  "I"  "IF" "J"  "JF" "K"  "L"  "M"  "O"  "P1" "TP" "U"
    [40] "U1" "V"  "V1" "W"  "Y"  "Z"  "Z1" "Z2" "Z4" "Z5" "ZA" "ZY"

What I want is to "bin" those categories into something like levels(TR_factor
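
A small sketch of one common way to collapse factor levels in base R: assigning a named list to levels(). The vector rating is a stand-in for TR_factor$crclscod, and the grouping by first letter is hypothetical, because the target bins are cut off in the excerpt.

```r
# Small stand-in for TR_factor$crclscod, using a few of the levels listed above.
rating <- factor(c("A", "A2", "AA", "B", "B2", "BA", "C"))

# Hypothetical grouping (by first letter); the real mapping is not given in the source.
binned <- rating
levels(binned) <- list(A = c("A", "A2", "AA"),
                       B = c("B", "B2", "BA"),
                       C = "C")

# Counts per new bin.
print(table(binned))
```

Any old level that is left out of the list becomes NA, so every original level should appear in the mapping.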

Update statement containing aggregate not working in SQL server

自作多情 posted on 2019-12-10 20:59:07
Question: I am hoping someone can help with my syntax here. I have two tables, ansicache..encounters and ansicache..x_refclaim_Table. The encounters table has an encounter column that matches the patacctnumber column in the x_refclaim_table. However, sometimes the patacctnumber can show up twice in the x_refclaim_table with different service dates (column iar_servicedate). I am trying to update the admitted column of the encounters table to the maximum value of iar_servicedate where the encounter in

Adding a non-aggregated column to an aggregated data set based on the aggregation of another column

…衆ロ難τιáo~ posted on 2019-12-10 20:11:48
Question: Is it possible to use the aggregate function to add another column from the original data frame, without actually using that column to aggregate the data? This is a very simplified version of the data that will help illustrate my question (let's call it data):

    name    result.1 result.2   replicate day data.for.mean
    "obj.1" 1        "good"     1         1   5
    "obj.1" 1        "good"     2         1   7
    "obj.1" 1        "great"    1         2   6
    "obj.1" 1        "good"     2         2   9
    "obj.1" 2        "bad"      1         1   10
    "obj.1" 2        "not good" 2         1   6
    "obj.1" 2        "bad"      1         2   5
    "obj.1" 2        "not good" 2         2   3

Arity of aggregate in logarithmic time

心不动则不痛 posted on 2019-12-10 19:46:18
Question: How can I determine the arity of an aggregate in logarithmic (at least base two) compilation time (strictly speaking, in a logarithmic number of instantiations)? What I can currently do is achieve the desired result in linear time:

    #include <type_traits>
    #include <utility>

    struct filler
    {
        template< typename type >
        operator type ();
    };

    template< typename A, typename index_sequence = std::index_sequence<>, typename = void >
    struct aggregate_arity
        : index_sequence
    {
    };

    template< typename A, std::size_t ...indices

aggregate with empty factor but keep row

房东的猫 posted on 2019-12-10 19:16:22
Question: I had a similar question with by(), where I accepted the fact that I had to manually replace the resulting NAs. Now I would like to aggregate my data.frame and keep the structure. E.g. my larger data set has factors for 100 countries * 10 years * 5 segments, so it should reduce to 5000 rows. But sometimes some of the segment factors are empty and I only get <5000 rows. I cannot get my head around it... My MWE still applies:

    #All 3 categories are used
    df1<-data.frame(
      val=rep(seq(1:4),3),
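
A hedged sketch of one way to keep every factor combination after aggregating: build the full grid of levels with expand.grid() and left-join the aggregate onto it with merge(). The data below is hypothetical (the question's MWE is truncated), and the names df, agg, grid and full are illustrative.

```r
# Hypothetical data (not the question's truncated MWE): segment "c" is a
# declared factor level but has no rows.
df <- data.frame(
  country = factor(c("X", "X", "Y", "Y")),
  segment = factor(c("a", "b", "a", "b"), levels = c("a", "b", "c")),
  val     = 1:4
)

# aggregate() only reports combinations that actually occur in the data.
agg <- aggregate(val ~ country + segment, data = df, FUN = sum)

# Build every country/segment combination, then left-join the aggregates onto it.
grid <- expand.grid(country = levels(df$country), segment = levels(df$segment))
full <- merge(grid, agg, all.x = TRUE)   # missing combinations get val = NA
print(full)
```

With 100 countries, 10 years and 5 segments, the same pattern would produce the full 5000-row grid, with NA wherever a combination has no observations.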

R aggregate gives differently structured results using subsets from the same data

元气小坏坏 posted on 2019-12-10 19:10:33
Question: I'm making diurnal cycles of wind speed based on a data frame (ball) of several years' hourly data. I want to plot them by season, so I subset out the dates I need and join them like this:

    b8 = subset(ball, as.Date(date)>="2008-09-01 00:00:00, GMT" & as.Date(date)<= "2008-11-30 23:00:00, GMT" )
    b9 = subset(ball, as.Date(date)>="2009-09-01 00:00:00, GMT" & as.Date(date)<= "2009-11-30 23:00:00, GMT" )
    b10 = subset(ball, as.Date(date)>="2010-09-01 00:00:00, GMT" & as.Date(date)<= "2010-11-30 23:00