cumulative-sum

How can I create a column that cumulatively adds the sum of two previous rows based on conditions?

ⅰ亾dé卋堺 submitted on 2019-12-04 21:00:00
I tried asking this question before, but it was poorly stated; this is a new attempt because I haven't solved it yet. I have a dataset with winners, losers, date, winner_points and loser_points. For each row, I want two new columns, one for the winner and one for the loser, showing how many points each has scored so far (as both winner and loser). Example data:

```r
winner <- c(1,2,3,1,2,3,1,2,3)
loser  <- c(3,1,1,2,1,1,3,1,2)
date   <- c("2017-10-01","2017-10-02","2017-10-03","2017-10-04","2017-10-05",
            "2017-10-06","2017-10-07","2017-10-08","2017-10-09")
winner_points <- c(2,1,2,1,2,1,2,1,2)
```
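A minimal pure-Python sketch of the running-total idea: keep one dictionary of points per player and read it out after each game. The question's example truncates before `loser_points`, so the values used for it below are hypothetical, purely for illustration.

```python
winner = [1, 2, 3, 1, 2, 3, 1, 2, 3]
loser  = [3, 1, 1, 2, 1, 1, 3, 1, 2]
winner_points = [2, 1, 2, 1, 2, 1, 2, 1, 2]
loser_points  = [1, 0, 1, 0, 1, 0, 1, 0, 1]  # hypothetical: not in the question

totals = {}                      # player id -> points scored so far (any role)
winner_cum, loser_cum = [], []   # the two new columns
for w, l, wp, lp in zip(winner, loser, winner_points, loser_points):
    totals[w] = totals.get(w, 0) + wp
    totals[l] = totals.get(l, 0) + lp
    winner_cum.append(totals[w])  # winner's total including this game
    loser_cum.append(totals[l])   # loser's total including this game
```

Whether "so far" should include the current row is ambiguous in the question; moving the `append` calls before the updates gives the strictly-before-this-game variant.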

Fetch cumulative sum from MySQL table

不打扰是莪最后的温柔 submitted on 2019-12-04 19:58:09
I have a table containing donations, and I am now creating a page to view statistics. I would like to fetch monthly data from the database with gross and cumulative gross.

```
mysql> describe donations;
+----------------+------------------+------+-----+---------+----------------+
| Field          | Type             | Null | Key | Default | Extra          |
+----------------+------------------+------+-----+---------+----------------+
| id             | int(10) unsigned | NO   | PRI | NULL    | auto_increment |
| transaction_id | varchar(64)      | NO   | UNI |         |                |
| donor_email    | varchar(255)     | NO   |     |         |                |
| net            | double           | NO   |     | 0       |                |
| gross          |
```
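A sketch of the computation in plain Python (in MySQL 8+ the cumulative part is usually a window function, e.g. `SUM(gross) OVER (ORDER BY month)`; on older versions, a user variable or self-join). The sample donations below are made up for illustration.

```python
# Aggregate per-month gross, then a running (cumulative) gross.
donations = [("2019-01-15", 10.0), ("2019-01-20", 5.0), ("2019-02-03", 7.5)]

monthly = {}                        # "YYYY-MM" -> gross for that month
for day, gross in donations:
    month = day[:7]
    monthly[month] = monthly.get(month, 0.0) + gross

report, running = [], 0.0
for month in sorted(monthly):
    running += monthly[month]
    report.append((month, monthly[month], running))  # (month, gross, cumulative)
```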

3D variant for summed area table (SAT)

岁酱吖の submitted on 2019-12-03 15:59:22
As per Wikipedia: a summed area table is a data structure and algorithm for quickly and efficiently generating the sum of values in a rectangular subset of a grid. For a 2D space, a summed area table can be generated by iterating x, y over the desired range:

I(x,y) = i(x,y) + I(x-1,y) + I(x,y-1) - I(x-1,y-1)

and the query for a rectangle with corners A (top-left), B (top-right), C (bottom-right), D (bottom-left) is given by:

I(C) + I(A) - I(B) - I(D)

I want to convert the above to 3D. Please also tell me if any other method/data structure is available for calculating partial sums in 3D space. I'm not sure

GNUPLOT: saving data from smooth cumulative

冷暖自知 submitted on 2019-12-03 08:45:31
I make this simple cumulative and histogram plot of a uniform random distribution of real numbers (n=1000): http://www.filedropper.com/random1_1 : random1.dat. The macro is:

```gnuplot
unset key
clear
reset
n=120                  # number of intervals
max=4.                 # max value
min=1.                 # min value
width=(max-min)/n      # interval width
# function used to map a value to the intervals
bin(x,width)=width*floor(x/width)+width/2.0   # so it is centered in the middle
set xtics min,(max-min)/10,max
set boxwidth width
set style fill solid 0.5 border
set ylabel 'Frequency'
set y2label 'Cumulative frequency'
set y2tics 0,100,1000
set ytics nomirror
```
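To save the curve rather than just plot it, gnuplot's `set table 'file'` redirects the next `plot ... smooth cumulative` to a data file. As a cross-check, here is a Python sketch of what `smooth cumulative` computes with the macro's `bin()` mapping (the uniform samples are generated here instead of read from random1.dat):

```python
import math
import random

n_bins, lo, hi = 120, 1.0, 4.0
width = (hi - lo) / n_bins

def bin_center(x):
    # same mapping as the macro's bin(x,width): centered in the middle of the bin
    return width * math.floor(x / width) + width / 2.0

random.seed(0)
samples = [random.uniform(lo, hi) for _ in range(1000)]

counts = {}
for x in samples:
    c = bin_center(x)
    counts[c] = counts.get(c, 0) + 1

# running sum of bin counts in bin order -- the data one would write to a file
cumulative, running = [], 0
for c in sorted(counts):
    running += counts[c]
    cumulative.append((c, running))
```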

Applying cumsum to binary vector

自闭症网瘾萝莉.ら submitted on 2019-12-02 12:27:30
I have a simple binary vector a which I try to translate into vector b using the R function cumsum. However, cumsum does not exactly return vector b. Here is an example:

```r
a <- c(1,0,0,0,1,1,1,1,0,0,1,0,0,0,1,1)
b <- c(1,2,2,2,3,4,5,6,7,7,8,9,9,9,10,11)
> cumsum(a)
 [1] 1 1 1 1 2 3 4 5 5 5 6 6 6 6 7 8
```

The problem is that whenever a 0 appears in vector a, the count should be increased by 1, but only for the first 0; the remaining ones keep the same value. Any advice would be great! :-)

The trick is to use diff to mark the transitions:

```r
cumsum(as.logical(a+c(0,abs(diff(a)))))
 [1]
```
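The same trick transcribed to Python, to make the mechanics explicit: `abs(diff(a))` is 1 wherever a changes value, so `a + change` is nonzero on every 1 and on every first 0 after a run of 1s; truthiness turns that into the 0/1 step vector that cumsum needs.

```python
a = [1, 0, 0, 0, 1, 1, 1, 1, 0, 0, 1, 0, 0, 0, 1, 1]

# c(0, abs(diff(a))): 1 at every position where a changes, 0 prepended
changed = [0] + [abs(a[i] - a[i - 1]) for i in range(1, len(a))]

# as.logical(a + changed): step by 1 on every 1 and on every first 0
step = [1 if (x + d) else 0 for x, d in zip(a, changed)]

b, running = [], 0
for s in step:          # cumsum
    running += s
    b.append(running)
```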

Cumulative value of an edge or node attribute while descending an igraph object

点点圈 submitted on 2019-12-02 06:51:23
I have an igraph object g made from dataframe df:

```r
df <- data.frame(c(0,1,2,2,4), c(1,2,3,4,5), c(0.01, 0.03, 0.05, 0.01, 0.02))
colnames(df) <- c('parent_id', 'id', 'dt')
g <- graph_from_data_frame(df)
```

Edges are made between parent_id and id.

```r
> g
IGRAPH DN-- 6 5 --
+ attr: name (v/c), dt (e/n)
+ edges (vertex names):
[1] 0->1 1->2 2->3 2->4 4->5
```

Change in thickness dt is the edge attribute. This can be thought of as the change in thickness between a 'parent' and 'child' iceberg (this is my problem/project).

```r
list.edge.attributes(g)
[1] "dt"
```

To visualize:

```r
plot(g, edge.label=E(g)$dt)
```

Example of
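The question is truncated, but the accumulation it seems to describe (total dt from the root down to every node) is a single traversal. A hedged sketch in plain Python, using the edge list from the example rather than igraph's API:

```python
# edges of the example graph: (parent, child) -> dt
edges = {(0, 1): 0.01, (1, 2): 0.03, (2, 3): 0.05, (2, 4): 0.01, (4, 5): 0.02}

children = {}
for (p, c) in edges:
    children.setdefault(p, []).append(c)

cum = {0: 0.0}            # root accumulates nothing
stack = [0]               # depth-first descent from the root
while stack:
    p = stack.pop()
    for c in children.get(p, []):
        cum[c] = cum[p] + edges[(p, c)]   # child total = parent total + edge dt
        stack.append(c)
```

In R/igraph terms this corresponds to summing `E(g)$dt` along the path returned by `shortest_paths()` from the root to each vertex, since a tree has exactly one such path.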

Pyspark - Cumulative sum with reset condition

别说谁变了你拦得住时间么 submitted on 2019-12-02 02:01:43
I have this dataframe:

```
+---+----+---+
|  A|   B|  C|
+---+----+---+
|  0|null|  1|
|  1| 3.0|  0|
|  2| 7.0|  0|
|  3|null|  1|
|  4| 4.0|  0|
|  5| 3.0|  0|
|  6|null|  1|
|  7|null|  1|
|  8|null|  1|
|  9| 5.0|  0|
| 10| 2.0|  0|
| 11|null|  1|
+---+----+---+
```

What I need to do is a cumulative sum of the values from column C until the next value is zero, then reset the cumulative sum, doing this until all rows are finished. Expected output:

```
+---+----+---+----+
|  A|   B|  C|   D|
+---+----+---+----+
|  0|null|  1|   1|
|  1| 3.0|  0|   0|
|  2| 7.0|  0|   0|
|  3|null|  1|   1|
|  4| 4.0|  0|   0|
|  5| 3.0|  0|   0|
|  6|null|  1|   1|
|  7|null|  1|   2|
|  8|null|  1
```
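The reset logic itself is simple to state in plain Python; in PySpark one typically builds a group id with a cumulative count of the zero rows over a window ordered by A, then takes `sum("C")` over a window partitioned by that group id. A sketch of the logic only:

```python
# D accumulates C within a run of nonzero C and resets to 0 whenever C is 0.
C = [1, 0, 0, 1, 0, 0, 1, 1, 1, 0, 0, 1]

D, running = [], 0
for c in C:
    running = running + c if c != 0 else 0   # reset on zero
    D.append(running)
```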

data.table cumulative stats of irregular observations with time window

安稳与你 submitted on 2019-12-02 01:57:33
I have some transactional records, like the following:

```r
library(data.table)
customers <- 1:75
purchase_dates <- seq( as.Date('2016-01-01'), as.Date('2018-12-31'), by=1 )
n <- 500L
set.seed(1)
# Assume the data are already ordered and 1 row per cust_id/purch_dt
df <- data.table(
  cust_id   = sample(customers, n, replace=TRUE),
  purch_dt  = sample(purchase_dates, n, replace=TRUE),
  purch_amt = sample(500:50000, n, replace=TRUE)/100
)[, .(purch_amt = sum(purch_amt)), keyby=.(cust_id, purch_dt) ]
df
#    cust_id   purch_dt purch_amt
#          1 2016-03-20     69.65
#          1 2016-05-17    413.60
#          1 2016-12-25    357.18
#          1 2017-03
```
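The question is truncated, but its title asks for cumulative stats over a time window per customer. A common pattern for irregular observations is a two-pointer sweep over each customer's date-ordered purchases; a hedged Python sketch (the 365-day window and the function name are my own):

```python
from datetime import date

def trailing_sums(purchases, window_days=365):
    """purchases: date-ordered (purch_dt, purch_amt) pairs for ONE customer.
    Returns, for each purchase, the sum of that customer's amounts in the
    trailing window ending at (and including) that purchase date."""
    sums, start, acc = [], 0, 0.0
    for d, amt in purchases:
        acc += amt
        # drop purchases that have fallen out of the window
        while (d - purchases[start][0]).days >= window_days:
            acc -= purchases[start][1]
            start += 1
        sums.append(acc)
    return sums
```

In data.table itself this kind of windowed sum is usually expressed as a non-equi self-join (`df[df, on=.(cust_id, purch_dt<=purch_dt, purch_dt>=window_start), ...]`), which avoids the per-row loop.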