lapply | 易学教程

Counting the number of rows of a series of csv files

I'm working through an R tutorial and suspect that I have to use one of these functions, but I'm not sure which. (Yes, I researched them, but until I become more fluent in R terminology they are quite confusing.) In my working directory there is a folder "specdata", which contains hundreds of CSV files named 001.csv - 300.csv. The function I am working on must count the total number of rows across a given set of CSV files. So if the argument in the function is 1:10 and each of those files has ten rows, it should return 100. Here's what I have so far: complete <- function(directory,id = 1:332) {
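One possible way to approach the row count with lapply is sketched below. The name count_rows() and the demo folder are assumptions made for illustration; this is not the tutorial's complete() function, whose body is cut off above. The sketch assumes the files are named with three zero-padded digits, as described in the question.

```r
# Sketch: count the total number of rows across selected CSV files.
# count_rows() is a hypothetical name, not the tutorial's complete().
count_rows <- function(directory, id = 1:332) {
  # Build zero-padded file names such as "001.csv"
  files <- file.path(directory, sprintf("%03d.csv", id))
  # Read each file and record its row count
  n_rows <- lapply(files, function(f) nrow(read.csv(f)))
  sum(unlist(n_rows))
}

# Demo with three small throwaway files in a temporary directory
demo_dir <- file.path(tempdir(), "specdata_demo")
dir.create(demo_dir, showWarnings = FALSE)
for (i in 1:3) {
  write.csv(data.frame(x = 1:10),
            file.path(demo_dir, sprintf("%03d.csv", i)),
            row.names = FALSE)
}
total <- count_rows(demo_dir, 1:3)
```

Reading every file in full just to count rows is simple but not the fastest option; for large files, counting lines would avoid parsing the data.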

Performing loops on list of lists of rasters

I need a solution; help will be much appreciated. In the following code I create three rasters. I then generate a random number of point locations on these rasters and get a list of three matrices, called samples, holding the coordinates of those random locations. I then sample the raster values at those locations to get samplevalues. What I want to change is this: I want to create sets of 100, 150, 200 and 250 random point locations (numberv). After generating these locations and receiving a list of locations, each raster will be sampled length(numberv) times (in this case 4 times).

calculation of 90 percentile and replacement of it by median by groups in R

问题 Here part of data. mydat=structure(list(code = c(123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 123L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L, 222L

Can lapply pass (to a function) values stored in a vector, successively

Question: I need lapply to pass (to a function) values stored in a vector, successively. values <- c(10,11,13,10) lapply(foo,function(x) peakabif(x,npeaks=values)) So as to get: peakabif(x1,npeaks=10) peakabif(x2,npeaks=11) peakabif(x3,npeaks=13) peakabif(x4,npeaks=10) Is this possible, or do I need to reconsider using lapply? Would a for loop inside the function work? Answer 1: You want to use mapply for this: mapply(peakabif, x=foo, npeaks=values) Answer 2: There are a couple of ways to handle this. You
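The pairing of list elements with vector values can be sketched in a runnable form as follows. Because peakabif() needs real input data, the sketch uses peak_stub(), a hypothetical stand-in that only reports the arguments it receives; the contents of foo are likewise made up. It shows Map() (a list-returning wrapper around mapply) next to an index-based lapply() call.

```r
# peak_stub() is a hypothetical stand-in for peakabif(); it only reports
# which arguments it received.
peak_stub <- function(x, npeaks) paste0(x, ":", npeaks)

foo <- list("x1", "x2", "x3", "x4")   # made-up inputs
values <- c(10, 11, 13, 10)

# Map() walks foo and values in parallel and always returns a list
res_list <- Map(peak_stub, foo, values)

# Index-based alternative that stays with lapply()
res_idx <- lapply(seq_along(foo), function(i) {
  peak_stub(foo[[i]], npeaks = values[i])
})
```

Both calls pair the first element of foo with the first element of values, the second with the second, and so on, which is the behaviour the question asks for.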

Efficient sampling from nested lists

I have a list of lists containing data.frames, from which I want to select only a few rows. I can achieve this in a for loop, where I create a sequence based on the number of rows and select only the row indices given by that sequence. But if I have deeper nested lists, it doesn't work anymore. I am also sure that there is a better way of doing this without a loop. What would be an efficient and generic approach to sample from nested lists that vary in their dimensions and contain data.frames or matrices? ## Dummy Data n1=100;n2=300;n3=100 crdOrig <- list( list(data.frame(x = runif(n1,10,20)
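A recursive function is one way to handle arbitrary nesting depth. The sketch below is an assumption-laden illustration: take_rows() is a hypothetical helper, the nested demo list is made up rather than the question's crdOrig (which is cut off above), and rows are picked at evenly spaced positions rather than at random.

```r
# take_rows() is a hypothetical helper: it walks a nested list and keeps
# n evenly spaced rows from every data.frame or matrix it finds.
take_rows <- function(x, n) {
  if (is.data.frame(x) || is.matrix(x)) {
    # Evenly spaced row indices; never more than the rows available
    idx <- unique(round(seq(1, nrow(x), length.out = min(n, nrow(x)))))
    x[idx, , drop = FALSE]
  } else if (is.list(x)) {
    # Recurse into deeper list levels
    lapply(x, take_rows, n = n)
  } else {
    x  # leave anything else untouched
  }
}

set.seed(1)
# Made-up nested data with different depths and dimensions
nested <- list(
  list(data.frame(x = runif(100, 10, 20), y = runif(100, 40, 60))),
  list(matrix(runif(600), ncol = 2),
       list(data.frame(x = runif(100), y = runif(100))))
)
small <- take_rows(nested, n = 5)
```

The data.frame check has to come before the is.list() check, because a data.frame is itself a list and would otherwise be split into its columns.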

Use R and Openxlsx to output a list of dataframes as worksheets in a single Excel file

I have a set of CSV files. I want to package them up and export the data to a single Excel file that contains multiple worksheets. I read in the CSV files as a set of data frames. My problem is how to construct the command in openxlsx. I can do it manually, but I am having a list construction issue: specifically, how to add a data frame as a subcomponent of a named list and then pass that list as a parameter to write.xlsx(). Example: OK, so I first list the CSV files on disk and generate a set of data frames in memory... # Generate a list of csv files on disk and shorten names... filePath <- "..
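The list construction step can be sketched as below. The demo folder and file names are assumptions made for illustration (the question's own filePath is cut off above). write.xlsx() accepts a named list of data frames and writes one worksheet per element, using the list names as sheet names; the call is guarded so the sketch still runs when openxlsx is not installed.

```r
# Create two throwaway CSV files in a hypothetical demo folder
csv_dir <- file.path(tempdir(), "csv_demo")
dir.create(csv_dir, showWarnings = FALSE)
write.csv(head(mtcars), file.path(csv_dir, "cars.csv"), row.names = FALSE)
write.csv(head(iris), file.path(csv_dir, "flowers.csv"), row.names = FALSE)

# Read every CSV into a list of data frames
files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)
sheets <- lapply(files, read.csv)

# Name each list element after its file (without the .csv extension)
names(sheets) <- tools::file_path_sans_ext(basename(files))

# Write one worksheet per list element, only if openxlsx is available
if (requireNamespace("openxlsx", quietly = TRUE)) {
  openxlsx::write.xlsx(sheets, file = file.path(csv_dir, "combined.xlsx"))
}
```

Excel limits sheet names to 31 characters, so long file names may need shortening before they are used as list names.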

Separate contents of field

Question: I'm sure this is very simple, and I think it's a case of using separate and gather. I have a single field in a dataframe, authorlist, an edited export of a PubMed search. It contains the authors of the publications, so it can hold either a single author or a collaboration of authors. For example, this is just a selection of the options available: Author Drijgers RL, Verhey FR, Leentjens AF, Kahler S, Aalten P. What I'd like to do is create a single list of ALL authors so that I'd
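A base R sketch of the splitting step, as an alternative to the tidyr functions the question mentions, might look like this. The second authorlist entry is made up to show a single-author row, and the sketch assumes authors are always separated by commas and that each entry ends with a period.

```r
authorlist <- c(
  "Drijgers RL, Verhey FR, Leentjens AF, Kahler S, Aalten P.",
  "Verhey FR."   # made-up single-author entry for illustration
)

# Drop the trailing period, then split on a comma plus optional spaces
cleaned <- sub("\\.\\s*$", "", authorlist)
authors <- unlist(strsplit(cleaned, ",\\s*"))

# One entry per distinct author
unique_authors <- unique(authors)
```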

Concatenate column names in data.table based on conditions [duplicate]

Question: This question already has answers here: Get column names where dat is equal to (3 answers). Closed 2 years ago. This is what my data.table looks like. The rightmost column, PASTE, is my desired column. library(data.table) dt <- fread(' A B C PASTE TRUE FALSE TRUE A,C TRUE TRUE TRUE A;B;C FALSE TRUE FALSE B FALSE FALSE FALSE ') I am trying to create the column PASTE by concatenating the names of all columns whose value in that row is TRUE. This is my attempt: dt[,PASTE:= if
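The row-wise logic can be illustrated with a base R sketch on a plain data.frame. This is not the data.table solution from the linked answers; it only shows the idea, and it assumes a comma as the separator (the example above mixes commas and semicolons).

```r
df <- data.frame(A = c(TRUE, TRUE, FALSE, FALSE),
                 B = c(FALSE, TRUE, TRUE, FALSE),
                 C = c(TRUE, TRUE, FALSE, FALSE))

# For each row, paste the names of the columns that are TRUE
df$PASTE <- apply(df[, c("A", "B", "C")], 1, function(row) {
  paste(names(row)[row], collapse = ",")
})
```

A row with no TRUE values yields an empty string, which corresponds to the blank PASTE entry in the last row of the example.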

R speed up the for loop using apply() or lapply() or etc

Question: I wrote a special "impute" function that replaces missing (NA) column values with either mean() or mode(), depending on the column name. The input dataframe has 400,000+ rows and the function is very slow. How can I speed up the imputation part using lapply() or apply()? Here is the function; I have marked the section I want optimized with START OPTIMIZE & END OPTIMIZE: specialImpute <- function(inputDF) { discoveredDf <- data.frame(STUDYID_SUBJID=character(), stringsAsFactors=FALSE) dfList
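Since the original function is cut off, the following is only a sketch of column-wise imputation with lapply(), not a rewrite of specialImpute(). The names stat_mode(), impute_columns() and mode_cols, as well as the demo columns, are assumptions. Note also that base R's mode() returns the storage mode of an object, not the most frequent value, so the sketch defines its own helper for that.

```r
# stat_mode() is a hypothetical helper returning the most frequent
# non-missing value of a vector (base mode() does something else).
stat_mode <- function(v) {
  v <- v[!is.na(v)]
  ux <- unique(v)
  ux[which.max(tabulate(match(v, ux)))]
}

# Columns listed in mode_cols get the most frequent value,
# every other column gets the mean (assumed rule for illustration).
impute_columns <- function(df, mode_cols) {
  df[] <- lapply(names(df), function(nm) {
    col <- df[[nm]]
    fill <- if (nm %in% mode_cols) stat_mode(col) else mean(col, na.rm = TRUE)
    col[is.na(col)] <- fill   # vectorised replacement, no row loop
    col
  })
  df
}

demo <- data.frame(age = c(20, NA, 40), sex = c("F", "F", NA),
                   stringsAsFactors = FALSE)
filled <- impute_columns(demo, mode_cols = "sex")
```

The speed gain in an approach like this comes mainly from replacing all NA values of a column in one vectorised assignment instead of visiting rows one at a time.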

R - iteratively apply a function of a list of variables

Question: My goal is to create a function that, when looped over multiple variables of a data frame, will return a new data frame containing the percents and 95% confidence intervals for each level of each variable. As an example, if I applied this function to "cyl" and "am" from the mtcars data frame, I would want this as the final result:

  variable level                ci.95
1      cyl     4 34.38 (19.50, 53.11)
2      cyl     6 21.88 (10.35, 40.45)
3      cyl     8 43.75 (27.10, 61.94)
4       am     0  59.38 (40.94, 75.5)
5       am     1 40.62 (24.50, 59.06)
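One way to structure such a function is to compute one small data frame per variable and stack the results with lapply() and rbind(). In the sketch below, level_ci() is a hypothetical helper and the interval comes from prop.test(); this is an assumption, and it may not reproduce the exact interval method behind the figures quoted in the question.

```r
# level_ci() is a hypothetical helper: percent and 95% CI per level of
# one variable. The interval method (prop.test) is an assumption.
level_ci <- function(var, data) {
  tab <- table(data[[var]])
  counts <- setNames(as.vector(tab), names(tab))
  n <- sum(counts)
  rows <- lapply(names(counts), function(lev) {
    k <- counts[[lev]]
    ci <- prop.test(k, n)$conf.int * 100
    data.frame(variable = var, level = lev, percent = 100 * k / n,
               lower = ci[1], upper = ci[2], stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

# Apply the helper to each variable and stack the results
result <- do.call(rbind, lapply(c("cyl", "am"), level_ci, data = mtcars))
```

Keeping percent, lower and upper as separate numeric columns leaves formatting into a single "percent (lower, upper)" string as a final, optional step.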