lapply

Linear regression loop with data.table; “Error in data.table column or argument (nr) is NULL”

孤街浪徒 submitted on 2020-01-06 05:27:14
Question: As my dataset is cumbersomely large, I would like to automate some procedures. I found this link, which proposes a linear regression loop; for the dataset mtcars it is as follows:
    data.table(mtcars)[, .(MyFits = lapply(.SD, function(x) if(is.numeric(x)) summary(lm(mpg ~ x)))), .SDcols = -1]
I have tried to apply this to my own dataset with limited success. I do get output, but there is a problem: the result for some of the fits is NULL, so when I try to do the suggested operation Fits[
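One way to sidestep the NULL entries is to select the numeric predictor columns before fitting, so that every element of the result is a model summary. A minimal base R sketch on mtcars, not the approach from the linked post; the names `predictors`, `fits`, and `slopes` are illustrative:

```r
# Keep only numeric columns, then drop the response itself.
predictors <- Filter(is.numeric, mtcars)
predictors <- predictors[setdiff(names(predictors), "mpg")]

# One simple regression of mpg on each remaining column.
fits <- lapply(predictors, function(x) summary(lm(mtcars$mpg ~ x)))

# Every element is a summary.lm object, so extraction never meets a NULL.
slopes <- sapply(fits, function(s) coef(s)["x", "Estimate"])
```

Filtering up front keeps the anonymous function free of an `if` without an `else`, which is what returns NULL for non-numeric columns in the one-liner above.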

R adding regression coefficients to data frame

蹲街弑〆低调 submitted on 2020-01-06 02:43:05
Question: I have a list of data frames that contains many subsets of data (about 470). I am trying to run a regression on each of them and add the regression coefficients to a data frame. The data frame will contain the coefficients for all dependent variables on each subgroup. I tried iterating with a for loop, but obviously that is not the right way. I think the solution has something to do with lapply?
    for (i in ListOfTraining){ lm(JOB_VOLUME ~ FEB+MAR+APR+MAY+JUN+JUL+AUG+SEP+OCT+NOV+DEC data
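The general pattern the question is reaching for can be sketched with lapply and rbind. Because the original data is not shown, the example below uses mtcars split by `cyl` as a stand-in for the list of training subsets, and `subsets`, `models`, and `coef_table` are hypothetical names:

```r
# Stand-in for the list of training subsets: mtcars split by cylinder count.
subsets <- split(mtcars, mtcars$cyl)

# Fit the same model on every subset.
models <- lapply(subsets, function(d) lm(mpg ~ wt + hp, data = d))

# Stack the coefficient vectors into one data frame, one row per subset.
coef_table <- as.data.frame(do.call(rbind, lapply(models, coef)))
```

Each row of `coef_table` is named after its subset, so the coefficients stay traceable to the group they came from.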

Alternatives to a for loop with indexing - R

对着背影说爱祢 submitted on 2020-01-05 07:37:12
Question: I am converting unstructured data into a long format and need to create an ID (grouping) variable. I want to assign an ID variable based on sets of values contained in another variable. More specifically, consider the following data set.
    set.seed(1234)
    x.1 <- rep(letters[1:5], 10)
    x.2 <- sample(c(0:10), 50, replace=TRUE)
    x.3 <- rep(NA, 50)
    df <- data.frame(x.1, x.2, x.3)
    df <- df[-c(2, 19),]
A unique case can be identified from the x.1 variable: it starts with a and ends with e. This is
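The excerpt is cut off before the full rule is stated, so the following is only a sketch under one assumption: every case opens with an "a" row, even when later rows of that case are missing. Under that assumption a running count of "a" rows gives the ID without a loop (the `id` column name is illustrative):

```r
set.seed(1234)
x.1 <- rep(letters[1:5], 10)
x.2 <- sample(c(0:10), 50, replace = TRUE)
df <- data.frame(x.1, x.2)
df <- df[-c(2, 19), ]

# Assumption: each case begins at an "a" row, so counting the "a" rows
# seen so far labels every row with its case number.
df$id <- cumsum(df$x.1 == "a")
```

If a case could lose its "a" row, this rule would merge it into the previous case, and a different boundary test would be needed.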

For glm2 package trying to convert factor to numeric and preserving it as a dataframe

巧了我就是萌 submitted on 2020-01-05 06:52:33
Question: Using the packages glm2 and mlbench in R, I am testing with the BreastCancer dataset. I started with 10 columns in the data frame with 479 observations. I want to convert 9 out of 10 columns from factor to numeric and preserve the data frame, but my new data frame ended up with 2 columns instead of 10. Here is my code:
    library(mlbench)
    library(glm2)
    data(BreastCancer)
    BC = na.omit(BreastCancer)
    BC = BC[, -1]
    indexes = sample(1:nrow(BC), size=0.3*nrow(BC))
    BCtrain = BC[-indexes,]
    as
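A common base R idiom for this kind of conversion is to assign the lapply result back into the selected columns, which keeps the object a data frame. The sketch below uses a small made-up data frame rather than BreastCancer so that it runs without mlbench; `bc` and its column names are hypothetical:

```r
# Toy stand-in: digits stored as factors, plus a class label to leave alone.
bc <- data.frame(
  thickness = factor(c("5", "3", "10")),
  size      = factor(c("1", "4", "8")),
  class     = factor(c("benign", "benign", "malignant"))
)

# Convert every column except the class label.
# as.character() first, because as.numeric() on a factor returns level codes.
num_cols <- setdiff(names(bc), "class")
bc[num_cols] <- lapply(bc[num_cols], function(x) as.numeric(as.character(x)))
```

Assigning into `bc[num_cols]` replaces the columns in place, so the column count and the untouched `class` column are preserved.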

R extract a part of a string in R

微笑、不失礼 submitted on 2020-01-05 04:02:22
Question: I have 5 million sequences (probes, to be specific) as below. I need to extract the name from each string. The names here are 1007_s_at:123:381, 10073_s_at:128:385 and so on. I am using the lapply function, but it is taking too much time. I have several other similar files. Could you suggest a faster way to do this?
    nm = c(
      "probe:HG-Focus:1007_s_at:123:381; Interrogation_Position=3570; Antisense;",
      "probe:HG-Focus:1007_s_at:128:385; Interrogation_Position=3615; Antisense;",
      "probe:HG-Focus:1007_s
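Because `sub` is already vectorized over its input, the per-element lapply can usually be replaced by a single call. A sketch, assuming every string follows the layout of the two complete examples above (two colon-separated prefix fields, then the name, then a semicolon); `probe_names` is an illustrative name:

```r
nm <- c(
  "probe:HG-Focus:1007_s_at:123:381; Interrogation_Position=3570; Antisense;",
  "probe:HG-Focus:1007_s_at:128:385; Interrogation_Position=3615; Antisense;"
)

# Skip the first two colon-separated fields, then capture everything
# up to the first semicolon.
probe_names <- sub("^[^:]*:[^:]*:([^;]*);.*$", "\\1", nm)
```

Strings that do not match the pattern are returned unchanged by `sub`, so it may be worth checking for those separately on real data.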

Apply a function over all combinations of a list of vectors -R

徘徊边缘 submitted on 2020-01-04 06:35:27
Question: I have a list of vectors and I need to apply a function to all possible combinations and express the result in a matrix. I can do that using a for loop, which is inefficient in R. Can anybody point out other ways to do it, e.g. using apply? Example code:
    list <- list(c(1,2),c(3,4),c(5,6))
    add_function <- function(x1,x2){
      g1 <- x1[1]+x2[2]
      g2 <- x1[2]+x2[1]
      return(g1*g2)
    }
I need to apply add_function to all possible combinations and get a 3 x 3 matrix.
Answer 1: We can use outer
    outer(seq_along
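The answer is cut off after `outer(seq_along`, so how it continues is not known. One common way to complete the idea is to let `outer` iterate over index pairs and wrap the pairwise call in `Vectorize`, since `outer` expects a function that accepts whole vectors. In this sketch `vecs` replaces the question's variable `list` (to avoid masking the base function) and `pair_fun` is a hypothetical helper:

```r
vecs <- list(c(1, 2), c(3, 4), c(5, 6))

add_function <- function(x1, x2) {
  g1 <- x1[1] + x2[2]
  g2 <- x1[2] + x2[1]
  g1 * g2
}

# outer() passes vectors of indices, so the pairwise function is vectorized
# over (i, j) before being handed to it.
pair_fun <- Vectorize(function(i, j) add_function(vecs[[i]], vecs[[j]]))
result <- outer(seq_along(vecs), seq_along(vecs), pair_fun)
```

`result[i, j]` holds `add_function(vecs[[i]], vecs[[j]])`; for this input the matrix is symmetric, with 9, 49, and 121 on the diagonal.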

Write a list, as seen in R console output, into a text file

我是研究僧i submitted on 2020-01-03 13:39:20
Question: I have a problem writing a list to a text file in R. Here is my code:
    library(e1071)
    mydata = read.table("TRAIN.txt", sep = ",", header = FALSE)
    model <- naiveBayes(as.factor(V1) ~., data = mydata)
and I want to write the "model" to a text file. Here is the "model" format:
    A-priori probabilities:
    Y
           0        1
    0.703125 0.296875
    Conditional probabilities:
       V2
    Y         [,1]      [,2]
      0  0.1327792 1.1571522
      1 -0.1276267 0.9334735
       V3
    Y         [,1]      [,2]
      0 -0.2414282 1.0982461
      1 -0.2269481 0.7594525
and I tried the
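To save an object exactly as the console prints it, one option is `capture.output` followed by `writeLines`. The sketch below uses an `lm` fit as a stand-in so that it runs without e1071; the same two calls should work for any object with a print method. The names `printed` and `out_file` are illustrative, and the temporary file is a placeholder for a real path:

```r
# Stand-in model; the question's naiveBayes object would take its place.
model <- lm(mpg ~ wt, data = mtcars)

# capture.output() returns the printed lines as a character vector.
printed <- capture.output(print(model))

# Placeholder destination; substitute a real path such as "model.txt".
out_file <- tempfile(fileext = ".txt")
writeLines(printed, out_file)
```

The file holds the printed representation only. To reload the model object itself later, `saveRDS` and `readRDS` are the usual tools.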

Find variables that occur only in ONE row in R

為{幸葍}努か submitted on 2020-01-01 19:03:11
Question: Using base R, I wonder how to answer the following question: Is there any value of X or Y that occurs in only one row and not in others? If yes, produce my desired output below.
    f <- data.frame(id = c(rep("AA",4), rep("BB",2), rep("CC",2)),
                    X = c(1,2,2,3,1,4,3,3),
                    Y = c(99,7,8,7,6,7,7,7))
Desired output:
    list(BB = c(X = 4, Y = 6), AA = c(Y = c(99, 8)))
    # $BB
    #  X  Y
    #  4  6
    # $AA
    # Y1 Y2   # Would be a plus if shows `Y Y` instead of `Y1 Y2`
    # 99  8
Answer 1: There are two big ideas with this base approach:
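The answer's two ideas are not visible in this excerpt, so the following is an independent base R sketch rather than a reconstruction of it. It reads "occurs only in one row" as "appears exactly once in its column", flags those values, and then collects them per id. The names `vars`, `once`, and `res` are illustrative:

```r
f <- data.frame(id = c(rep("AA", 4), rep("BB", 2), rep("CC", 2)),
                X = c(1, 2, 2, 3, 1, 4, 3, 3),
                Y = c(99, 7, 8, 7, 6, 7, 7, 7))
vars <- c("X", "Y")

# Logical matrix: TRUE where a value appears exactly once in its column.
once <- sapply(f[vars], function(v) {
  !(duplicated(v) | duplicated(v, fromLast = TRUE))
})

# For each id, keep the flagged values of each variable; drop empty results.
res <- lapply(split(seq_len(nrow(f)), f$id), function(rows) {
  out <- lapply(vars, function(v) f[[v]][rows][once[rows, v]])
  names(out) <- vars
  unlist(out[lengths(out) > 0])
})
res <- Filter(Negate(is.null), res)
```

This yields `AA` with the Y values 99 and 8 (named `Y1`, `Y2` by `unlist`) and `BB` with X = 4 and Y = 6; the groups come out in alphabetical order rather than the order shown in the desired output.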

Using lapply and read.csv on multiple files (in R)

雨燕双飞 submitted on 2020-01-01 06:30:07
Question: I guess this is a bit of a beginner's question, but I haven't quite found an answer or figured out what I'm doing wrong. I'm trying to read 20 CSV files that are stored in a separate directory using:
    setwd("./Data")
    filenames <- list.files()
    All <- lapply(filenames,function(i){
      i <- paste(".\\",i,sep="")
      read.csv(i, header=TRUE, skip=4)
    })
And I get the following error:
    Error in file(file, "rt") : cannot open the connection
    In addition: Warning message:
    In file(file, "rt") : cannot open file '
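A frequent way around path problems of this kind is to skip `setwd` and hand-built paths and instead ask `list.files` for full paths. The sketch below first creates a throwaway directory with two small files so that it runs anywhere; `data_dir` and the file contents are made up for illustration, and the four preamble lines mirror the question's `skip=4`:

```r
# Build a temporary directory with two small files (illustrative data only).
data_dir <- file.path(tempdir(), "Data")
dir.create(data_dir, showWarnings = FALSE)
for (k in 1:2) {
  writeLines(
    c(rep("preamble", 4), "a,b", "1,2", "3,4"),
    file.path(data_dir, paste0("file", k, ".csv"))
  )
}

# full.names = TRUE returns paths that read.csv can open directly,
# regardless of the current working directory.
filenames <- list.files(data_dir, pattern = "\\.csv$", full.names = TRUE)
All <- lapply(filenames, read.csv, header = TRUE, skip = 4)
names(All) <- basename(filenames)
```

Extra arguments after the function name in `lapply` are passed on to `read.csv`, so no anonymous wrapper is needed.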