lapply

R: How to change data in a column across multiple files. Help understanding lapply

坚强是说给别人听的谎言 submitted on 2019-12-25 08:27:15
Question: I have a folder with about 160 files, each formatted with three columns: onset time, variable 1 'x', and variable 2 'y'. Onset is read into R as a string, but it is a time variable of the form Hour:Minute:Second:FractionalSecond. I need to remove the fractional second. Rounding would be great, but it would be okay to just drop the fractional second using something like substr(file$onset,1,8). My files are named in a format similar to File001 File002 File054 File1001 onset X Y 00
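The truncation approach described in the question can be sketched with lapply as follows. This is a minimal sketch, not the accepted answer: the temporary folder, the two demo files, and the whitespace-separated layout are assumptions made so the example runs on its own.

```r
# Hypothetical setup: two small files in a temporary folder stand in for the ~160 real ones.
in_dir <- file.path(tempdir(), "onset_demo")
dir.create(in_dir, showWarnings = FALSE)
demo <- data.frame(onset = c("00:00:01:25", "00:00:02:75"), X = 1:2, Y = 3:4,
                   stringsAsFactors = FALSE)
for (nm in c("File001", "File002")) {
  write.table(demo, file.path(in_dir, nm), row.names = FALSE, quote = FALSE)
}

# For each file: read it, keep the first 8 characters of onset (HH:MM:SS),
# write the result back to the same path, and return the cleaned data frame.
files <- list.files(in_dir, full.names = TRUE)
cleaned <- lapply(files, function(path) {
  dat <- read.table(path, header = TRUE, stringsAsFactors = FALSE)
  dat$onset <- substr(dat$onset, 1, 8)
  write.table(dat, path, row.names = FALSE, quote = FALSE)
  dat
})
names(cleaned) <- basename(files)
```

Writing to a separate output folder instead of overwriting the inputs is the safer choice until the result has been checked.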

Regression of variables in a dataframe

北慕城南 submitted on 2019-12-25 06:44:59
Question: I have a dataframe: df = data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50)) I would like to regress each variable on all the other variables, for instance: fit1 <- lm(x1 ~ ., data = df) fit2 <- lm(x2 ~ ., data = df) etc. (Of course, the real dataframe has a lot more variables.) I tried putting them in a loop, but it didn't work. I also tried using lapply but couldn't produce the desired result either. Does anyone know the trick? Answer 1: You can use reformulate to
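The answer is cut off after mentioning reformulate, so the following is only one plausible way to use it, not necessarily what the answerer wrote. The sketch builds one formula per column, with that column as the response and the remaining columns as predictors.

```r
set.seed(1)
df <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50), x4 = rnorm(50))

# One model per column: the column is the response, all others are predictors.
fits <- lapply(names(df), function(resp) {
  predictors <- setdiff(names(df), resp)
  lm(reformulate(predictors, response = resp), data = df)
})
names(fits) <- names(df)

# Example: the formula used for the second model.
formula(fits$x2)
```

Keeping the fits in a named list makes it easy to continue with calls such as lapply(fits, summary).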

Convert a list of numeric vectors with different lengths to data.frame

自作多情 submitted on 2019-12-25 01:55:27
Question: I have a df: dput(head(data)) structure(list(company_code = c(1L, 1L, 1L, 1L, 1L, 11L, 11L, 11L, 12L, 13L, 13L), company_name = c("AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Billingsfors-Långed", "AB Iggesunds B", "AB Iggesunds B", "AB Iggesunds B", "AB Industripapp", "AB Klippans FinpB", "AB Klippans FinpB" ), year_cg_code = c(11920L, 11920L, 11920L, 11920L, 11920L, 111929L, 111929L, 111929L, 121929L, 131929L, 131929L), plant
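The excerpt ends before the list itself is shown, so the sketch below uses a hypothetical list named vecs to illustrate the task in the title: padding numeric vectors of different lengths with NA so they fit into one data.frame.

```r
# Hypothetical input: numeric vectors of unequal length.
vecs <- list(a = c(1, 2, 3), b = c(4, 5), c = 6)

# Extending a vector's length pads it with NA, so every column ends up equally long.
n <- max(lengths(vecs))
padded <- lapply(vecs, function(v) {
  length(v) <- n
  v
})
wide <- as.data.frame(padded)
```

If a long layout suits the data better, stack(vecs) returns one row per value with the source name in a second column.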

How to save and name multiple plots with R

好久不见. submitted on 2019-12-25 01:46:25
Question: I have a list of 73 data sets called "L1", generated by the function "mcsv_r()", and a function "gc()" that generates a map. Using "lapply" I can create my 73 plots. I need to save and name all of them. I know I can do it one by one with RStudio, but I am sure that by combining "jpeg()" and "dev.off" with a loop I can do it in a few lines of code. out <- setwd("C:/") dir(out) mcsv_r(dir(out)) gc <- function(x){ xlim <- c(-13.08, 8.68) ylim <- c(34.87, 49.50) map("world", lwd=0.05
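The jpeg() and dev.off() loop the question asks about might look like the sketch below. The three small data sets, the draw() function, and the output folder are hypothetical stand-ins for L1, the map function, and the real destination. The stand-in is deliberately not called gc, since that name masks base R's garbage-collection function.

```r
# Hypothetical stand-ins for the 73 data sets and the map-drawing function.
L1 <- list(a = data.frame(x = 1:5, y = (1:5)^2),
           b = data.frame(x = 1:5, y = 5:1),
           c = data.frame(x = 1:5, y = rep(2, 5)))
draw <- function(d) plot(d$x, d$y, type = "b")

out_dir <- file.path(tempdir(), "plots")
dir.create(out_dir, showWarnings = FALSE)
paths <- file.path(out_dir, paste0(names(L1), ".jpeg"))

# Open a JPEG device, draw, and close the device again, once per list element.
invisible(lapply(seq_along(L1), function(i) {
  jpeg(paths[i], width = 800, height = 600)
  on.exit(dev.off())   # closes the device even if draw() fails
  draw(L1[[i]])
}))
```

Iterating over seq_along(L1) rather than L1 itself gives the function access to both the data and the matching file name.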

How do I use lapply to load files into the global environment?

◇◆丶佛笑我妖孽 submitted on 2019-12-25 01:45:19
Question: I have the following working code: ############################################ ###Read in all the wac gzip files########### ###Add some calculated fields ########### ############################################ library(readr) setwd("N:/Dropbox/_BonesFirst/65_GIS_Raw/LODES/") directory<-("N:/Dropbox/_BonesFirst/65_GIS_Raw/LODES/") to.readin <- as.list(list.files(pattern="2002.csv")) LEHD2002<-lapply(to.readin, function(x) { read.table(gzfile(x), header = TRUE, sep = ",", colClasses = "numeric
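One common way to get the results of lapply into the global environment is to name the list and pass it to list2env. The sketch below assumes that approach; the demo folder and the file names wac_a.csv and wac_b.csv are hypothetical, and plain CSV files replace the gzipped ones to keep the example small. Note that lapply accepts a character vector directly, so wrapping list.files() in as.list() is not required.

```r
# Hypothetical demo files standing in for the real LODES extracts.
demo_dir <- file.path(tempdir(), "lodes_demo")
dir.create(demo_dir, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, jobs = c(10, 20)),
          file.path(demo_dir, "wac_a.csv"), row.names = FALSE)
write.csv(data.frame(id = 3:4, jobs = c(30, 40)),
          file.path(demo_dir, "wac_b.csv"), row.names = FALSE)

# Read every file into a list and name each element after its file.
files <- list.files(demo_dir, pattern = "\\.csv$", full.names = TRUE)
tables <- lapply(files, read.csv)
names(tables) <- tools::file_path_sans_ext(basename(files))

# Copy each list element into the global environment under its name.
invisible(list2env(tables, envir = globalenv()))
```

Keeping the data in the named list is often easier to work with than separate global objects, because later steps can again be applied with lapply.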

Using apply functions instead of for and branching statements in R

时光总嘲笑我的痴心妄想 submitted on 2019-12-25 01:41:08
Question: I am using R and would like to stop using branching and for statements and take advantage of the apply functions instead. That being said, I have this list, x: x <- c(5,12,19,26,2,9,16,23) I would like a corresponding list as follows: for i in x if (i<=7) 1 else if (i<=14) 2 else if (i<=21) 3 else if (i<=28) 4 else 5 The final new list will be: 1,2,3,4,1,2,3,4 How can I do this with one of the apply statements? Every time I try to write one I end up scratching my head for an hour and then post a
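The branching rule in the question can be written two ways, sketched here: a direct sapply translation of the if/else chain, and a vectorized version using cut, which needs no apply function at all. Both are illustrations rather than the answer given on the original page.

```r
x <- c(5, 12, 19, 26, 2, 9, 16, 23)

# Direct translation: apply the if/else chain to each element.
by_sapply <- sapply(x, function(i) {
  if (i <= 7) 1 else if (i <= 14) 2 else if (i <= 21) 3 else if (i <= 28) 4 else 5
})

# Vectorized: intervals are closed on the right, matching the <= tests above.
by_cut <- cut(x, breaks = c(-Inf, 7, 14, 21, 28, Inf), labels = FALSE)

by_cut
# [1] 1 2 3 4 1 2 3 4
```

The cut version scales better to long vectors because the comparison happens in compiled code rather than once per element in R.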

Vectorizing & parallelizing the disaggregation of a list

坚强是说给别人听的谎言 submitted on 2019-12-25 01:17:49
Question: Here's some code that generates a list of data.frames and then converts that original list into a new list in which each element is a list of the rows of each data frame. E.g.: - l1 has length 10 and each element is a data.frame with 1000 rows. - l2 is a list of length 1000 ( nrow(l1[[k]]) ) and each element is a list of length 10 ( length(l1) ) containing row-vectors from the elements of l1 l1 <- vector("list", length= 10) set.seed(65L) for (i in 1:10) { l1[[i]] <- data.frame(matrix(rnorm(10000
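The restructuring described above (a list of data frames turned into a list of per-row lists) can be sketched with two nested lapply calls. The sizes here are reduced to 3 data frames of 4 rows, an assumption made only to keep the example quick; the original uses 10 data frames of 1000 rows.

```r
set.seed(65L)
# Small stand-in for l1: 3 data frames with 4 rows and 5 columns each.
l1 <- lapply(1:3, function(i) data.frame(matrix(rnorm(20), ncol = 5)))

# Outer lapply walks over row indices, inner lapply picks that row from every data frame.
l2 <- lapply(seq_len(nrow(l1[[1]])), function(r) {
  lapply(l1, function(d) d[r, ])
})

length(l2)        # number of rows per data frame
length(l2[[1]])   # number of data frames in l1
```

Because each outer iteration is independent, the outer lapply could in principle be swapped for parallel::mclapply on platforms that support forking; whether that pays off depends on the data size.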

How to subtract a median only from integer value

我们两清 submitted on 2019-12-25 00:07:32
Question: I have this dataset df=structure(list(Dt = structure(1:39, .Label = c("2018-02-20 00:00:00.000", "2018-02-21 00:00:00.000", "2018-02-22 00:00:00.000", "2018-02-23 00:00:00.000", "2018-02-24 00:00:00.000", "2018-02-25 00:00:00.000", "2018-02-26 00:00:00.000", "2018-02-27 00:00:00.000", "2018-02-28 00:00:00.000", "2018-03-01 00:00:00.000", "2018-03-02 00:00:00.000", "2018-03-03 00:00:00.000", "2018-03-04 00:00:00.000", "2018-03-05 00:00:00.000", "2018-03-06 00:00:00.000", "2018-03-07 00:00:00

Speeding up a function

五迷三道 submitted on 2019-12-24 21:25:35
Question: I want to calculate the first differences for a large panel data set. At the moment, however, this takes more than an hour. I am really curious to know if there are still any options left to speed up the process. As an example database: set.seed(1) DF <- data.table(panelID = sample(50,50), # Creates a panel ID Country = c(rep("A",30),rep("B",50), rep("C",20)), Group = c(rep(1,20),rep(2,20),rep(3,20),rep(4,20),rep(5,20)), Time = rep(seq(as.Date("2010-01-03"), length=20, by="1 month") - 1,5),
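The example data is cut off before the columns to be differenced appear, so the sketch below uses a hypothetical value column and a plain data.frame. It shows a grouped first difference computed in one vectorized call with ave, which avoids looping over panels in R code.

```r
# Hypothetical panel: 3 units observed over 4 periods, with one column to difference.
DF <- data.frame(panelID = rep(1:3, each = 4),
                 Time    = rep(1:4, times = 3),
                 value   = c(1, 3, 6, 10,  2, 2, 5, 4,  0, 7, 7, 9))

# Sort within panel, then difference within each panel; the first period has no predecessor.
DF <- DF[order(DF$panelID, DF$Time), ]
DF$d_value <- ave(DF$value, DF$panelID, FUN = function(v) c(NA, diff(v)))
```

With data.table, as used in the question, the equivalent grouped operation is usually written as DF[, d_value := value - shift(value), by = panelID], which tends to be faster on large panels.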

How to get in a specific order the results of an r lapply function with arguments from a dataframe

不问归期 submitted on 2019-12-24 19:57:21
Question: Following a previous question I asked, I got an awesome answer. Here is a quick summary: I want to compute a multidimensional development index based on South African data for several years. My list is composed of individual information for each year, so basically df1 is about year 1 and df2 about year 2. df1<-data.frame(var1=c(1, 1,1), var2=c(0,0,1), var3=c(1,1,0)) df2<-data.frame(var1=c(1, 0,1), var2=c(1,0,1), var3=c(0,1,0)) mylist <-list (df1,df2) var1 could be the stance on religion of each