data.table

Using lapply to create new columns based on old columns

倾然丶 夕夏残阳落幕 submitted on 2021-02-05 09:34:31
Question: My data looks as follows:

    DF <- structure(list(
      No_Adjusted_Gross_Income = c(183454, 241199, 249506),
      NoR_from_1_to_5000 = c(1035373, 4272260, 1124098),
      NoR_from_5000_to_10000 = c(319540, 4826042, 1959866)),
      row.names = c(NA, -3L), class = c("data.table", "data.frame"))
    val <- c(2500.5, 7500)
    vn <- c("AGI_from_1_to_5000", "AGI_from_5000_to_10000")

       No_Adjusted_Gross_Income NoR_from_1_to_5000 NoR_from_5000_to_10000
    1:                   183454            1035373                 319540
    2:                   241199            4272260                4826042
    3:                   249506            1124098                1959866

I

Rbind list of vectors with differing lengths

邮差的信 submitted on 2021-02-05 08:44:50
Question: I am new to R and I am trying to build a frequency/severity simulation. Everything works fine, except that it takes about 10 minutes to run 10,000 simulations for each of 700 locations. For the simulation of one individual location, I get a list of vectors with varying lengths, and I would like to rbind these vectors efficiently, filling in NAs for all non-existing values. I would like R to return a data.frame. So far, I used rbind.fill.matrix after converting the vectors in the list to
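The padding step described in the question can be sketched in base R. This is only a minimal illustration, not the asker's code: the list `sims` and its values are hypothetical, and it swaps the `rbind.fill.matrix` approach for padding every vector to a common length before binding.

```r
# Hypothetical input: a list of numeric vectors with differing lengths
sims <- list(c(1.5, 2.0, 3.2), c(4.1), c(5.0, 6.3))

# The longest vector determines the number of columns
max_len <- max(lengths(sims))

# `length<-` extends a vector to the requested length, filling with NA
padded <- lapply(sims, `length<-`, max_len)

# Bind the padded vectors as rows and convert to a data.frame
result <- as.data.frame(do.call(rbind, padded))
```

Each list element becomes one row, and shorter vectors are right-padded with NA. Whether this is faster than the original approach for 10,000 simulations would need to be measured.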

How to create a co-occurrence matrix calculated from combinations by ID/row in R?

♀尐吖头ヾ submitted on 2021-02-05 07:00:14
Question: Update: Thanks to @jazzurro for his answer. It made me realize that the duplicates may just complicate things. I hope that keeping only unique values per row simplifies the task.

    df <- data.frame(ID = c(1,2,3,4,5),
                     CTR1 = c("England", "England", "England", "China", "Sweden"),
                     CTR2 = c("England", "China", "China", "England", NA),
                     CTR3 = c("USA", "USA", "USA", "USA", NA),
                     CTR4 = c(NA, NA, NA, NA, NA),
                     CTR5 = c(NA, NA, NA, NA, NA),
                     CTR6 = c(NA, NA, NA, NA, NA))

    ID CTR1 CTR2 CTR3 CTR4 CTR5 CTR6
    1

Find next date in series by group

纵然是瞬间 submitted on 2021-02-05 05:21:30
Question: I have some data like this:

    sample.data <- rbind(
      data.table(start.date=seq(from=as.Date("2010-01-01"), to=as.Date("2014-12-01"), by="quarter"),
                 Group=c("A","B","C","D"), rnorm(20, 5)),
      data.table(start.date=seq(from=as.Date("2010-01-01"), to=as.Date("2014-12-01"), by="quarter"),
                 Group=c("A","B","C","D"), rnorm(20, 3))
    )

I would like to create an end.date column that equals the next earliest start.date value for each group. So, for example, the first start.date for Group==A is 2010-01-01. The
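One possible way to derive such an end.date column is sketched below with data.table's `shift()` and an update join. The table `dt`, its values, and the helper table `nxt` are hypothetical and much smaller than the data in the question. The sketch assumes that rows sharing the same Group and start.date should all receive the same end.date, which the truncated question does not confirm.

```r
library(data.table)

# Hypothetical data: each (Group, start.date) pair may occur more than once
dt <- data.table(
  start.date = as.Date(c("2010-01-01", "2011-01-01", "2010-04-01",
                         "2011-04-01", "2010-01-01", "2011-01-01")),
  Group = c("A", "A", "B", "B", "A", "A"),
  value = c(5.1, 4.8, 5.3, 5.0, 3.2, 2.9)
)

# Distinct start dates per group, ordered so that "next" is well defined
nxt <- unique(dt[, .(Group, start.date)])
setorder(nxt, Group, start.date)

# Within each group, the end date is the following start date (NA for the last one)
nxt[, end.date := shift(start.date, type = "lead"), by = Group]

# Update join: copy end.date back onto every matching row of the original table
dt[nxt, on = .(Group, start.date), end.date := i.end.date]
```

Working on the distinct dates first avoids assigning a row its own start.date as end.date when duplicates are present.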

Non-equi join, then summarize by group

对着背影说爱祢 submitted on 2021-02-04 18:08:27
Question: Here is an MWE.

    dta <- data.table(id=rep(1:2, each=5), seq=rep(1:5, 2), val=1:10)
    dtb <- data.table(id=c(1, 1, 2, 2), fil=c(2, 3, 3, 4))
    dtc <- data.table(id=c(1, 1, 2, 2), mval=rep(0, 4))
    for (ind in 1:4)
      dtc$mval[ind] <- mean(dta$val[dta$id == dtb$id[ind] & dta$seq < dtb$fil[ind]])
    dtc
    #    id mval
    # 1:  1  1.0
    # 2:  1  1.5
    # 3:  2  6.5
    # 4:  2  7.0

dtc should have the same number of rows as dtb. For every row ind in dtc, dtc$id[ind] = dtb$id[ind] and dtc$mval[ind] = mean(dta$val[x]), where x is
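The loop in the MWE can probably be replaced by a single non-equi join that aggregates once per row of dtb. The sketch below reuses dta and dtb from the question and relies on data.table's `on = .(...)` syntax with `by = .EACHI`. It is one possible formulation, not necessarily the answer the asker accepted.

```r
library(data.table)

dta <- data.table(id = rep(1:2, each = 5), seq = rep(1:5, 2), val = 1:10)
dtb <- data.table(id = c(1, 1, 2, 2), fil = c(2, 3, 3, 4))

# For each row of dtb, match the rows of dta with the same id and seq < fil,
# then average val over those matches (by = .EACHI groups by each row of dtb)
dtc <- dta[dtb, on = .(id, seq < fil), .(mval = mean(val)), by = .EACHI]

# In a non-equi join the result column named `seq` carries the values of `fil`,
# so rename it to avoid confusion
setnames(dtc, "seq", "fil")
```

Because the grouping is per row of dtb, dtc has the same number of rows as dtb, as the question requires.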
