apply | 易学教程

applying sapply or other apply function instead of nested for loop for lists of data frames

Question: I have two lists of data frames, 'humanSplit' and 'ratSplit', and they are of the form

> ratSplit$Kidney_F_GSM1328570
  ratGene ratReplicate         alignment RNAtype
1 Crot    Kidney_F_GSM1328570  7         REV
2 Crot    Kidney_F_GSM1328570  12        REV
3 Crot    Kidney_F_GSM1328570  4         REV

and

> humanSplit$Fetal_Brain_408_AGTCAA_L009_R1_report.txt
   humanGene humanReplicate                             alignment RNAtype
53 ZFP28     Fetal_Brain_408_AGTCAA_L009_R1_report.txt  5         reg
55 RC3H1     Fetal_Brain_408_AGTCAA_L009_R1_report.txt  9         reg
56 IFI27     Fetal_Brain

Calculate Correlations of Pairs of Columns in a Data Frame in R

Question: I have the following data frame:

set.seed(1)
y <- data.frame(a1 = rnorm(5), b1 = rnorm(5), c1 = rnorm(5),
                a2 = rnorm(5), b2 = rnorm(5), c2 = rnorm(5))

I would like to obtain the correlations of the pairs of columns: cor(a1,a2), cor(b1,b2), cor(c1,c2). I tried the following, but NAs appear as output:

apply(y,2,function(x) cor(x[1],x[3]))

I would like to get a result equivalent to

cor(y[,1],y[,4])
cor(y[,2],y[,5])
cor(y[,3],y[,6])

In my actual data frame, I have many more pairs of columns. Any
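
The attempt in the excerpt returns NA because apply(y, 2, ...) hands each column to the function one at a time, so x[1] and x[3] are single numbers, and the correlation of two length-one vectors is undefined. One possible approach, sketched here under the assumption that the first half of the columns pairs up in order with the second half (the names half, first, second and pair_cor are illustrative, not from the question), is to walk both halves in parallel with mapply:

```r
set.seed(1)
y <- data.frame(a1 = rnorm(5), b1 = rnorm(5), c1 = rnorm(5),
                a2 = rnorm(5), b2 = rnorm(5), c2 = rnorm(5))

# Assumption: column i pairs with column i + ncol(y) / 2.
half   <- ncol(y) / 2
first  <- y[seq_len(half)]          # a1, b1, c1
second <- y[half + seq_len(half)]   # a2, b2, c2

# mapply() iterates over both column lists in parallel,
# calling cor() once per pair of columns.
pair_cor <- mapply(cor, first, second)
names(pair_cor) <- paste(names(first), names(second), sep = "_")
pair_cor
```

Because the pairing is derived from the column positions, the same lines should work unchanged for a data frame with many more pairs, as long as the column order follows the same layout.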

convert factors to numeric in dataframe

Question: I have a very large data frame containing factors with two levels, "No" and "Yes". I would like to replace the levels with numeric values, so that "No" turns into 0 and "Yes" turns into 1. I would like to apply a function that works on the whole data frame. A simple example to work on:

> df
  a b   c   d
1 1 No  Yes 1
2 2 No  No  3
3 3 Yes No  123
4 4 Yes Yes 12
5 5 No  Yes 231
6 6 No  No  21
7 7 Yes No  21
8 8 Yes No  21

> str(df)
'data.frame': 8 obs. of 4 variables:
 $ a: int 1 2 3 4 5 6 7 8
 $ b: Factor w/ 2
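
As a rough sketch of one way to do this, the example below rebuilds the sample data from the printed table (the integer type of column d is an assumption, since the str() output is cut off) and recodes only the factor columns. The helper name is_fac is illustrative.

```r
# Sample data rebuilt from the printed table in the question.
df <- data.frame(
  a = 1:8,
  b = factor(c("No", "No", "Yes", "Yes", "No", "No", "Yes", "Yes")),
  c = factor(c("Yes", "No", "No", "Yes", "Yes", "No", "No", "No")),
  d = c(1L, 3L, 123L, 12L, 231L, 21L, 21L, 21L)  # type assumed
)

# Find the factor columns, then recode each one:
# comparing against "Yes" gives TRUE/FALSE, and as.integer() maps that to 1/0.
is_fac <- vapply(df, is.factor, logical(1))
df[is_fac] <- lapply(df[is_fac], function(f) as.integer(f == "Yes"))

str(df)
```

Comparing against the label avoids relying on the internal level order, which as.integer() applied directly to a factor would expose (1 and 2 rather than 0 and 1).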

R: Using Apply Function to clean likert answers

Question: I have a survey where users have filled in answers to questions on a Likert scale. The software we are using has put text into the answers when all I want is numbers, so an example is NEITHER AGREE NOR DISAGREE4, which I want to convert to 4. I have R code that does this: it looks for a number in each cell of column 18 of the data frame and extracts the number if one exists:

as.numeric(ifelse(grepl("[0-9]",df[,18]),as.numeric(gsub(x = df[,18],"[A-z]","")), df[,18]))

I would like to be

How to optimise filtering and counting for every row in a large R data frame

Question: I have a data frame, such as the following:

  name day wages
1 Ann  1   100
2 Ann  1   150
3 Ann  2   200
4 Ann  3   150
5 Bob  1   100
6 Bob  1   200
7 Bob  1   150
8 Bob  2   100

For every unique name/day pair, I would like to calculate a range of totals, such as 'number of times wages was greater than 175 on the current or next day for this person'. There are many more columns than wages, and there are four time slices to be applied to each total for each row. I can currently accomplish this by unique'ing my data frame: df

Execute stored procedure in OUTER APPLY block

Question: Why can't I use a stored procedure in an OUTER APPLY block? I need to get an int value from the stored procedure dbo.GetTeacherId and use it in a WHERE clause. Here is my code:

USE [StudentsDb]
DECLARE @teacherIdOut int;
SELECT StudentLastName, StudentFirstName, StudentMiddleName, LessonName, Score, TLastName, TFirstName, TMiddleName
FROM Scores
JOIN Students ON Scores.StudentId=Students.StudentId
JOIN Lessons ON Scores.LessonId=Lessons.LessonId
OUTER APPLY ( EXECUTE dbo.GetTeacherId 0, 0,

Applying function row-wise in a data.table; passing column names as a vector

Question: Consider a function foo as follows.

foo <- function(a, b, c) {
  out <- (sum(a) + sqrt(prod(c))) / sqrt(pi * b)
  return(out)
}

I would like to apply the function to a data.table DT with the data in columns as arguments, row-wise according to a unique key column ID.

DT <- structure(list(ID = c("K1L1", "K1L2", "K1L3", "K2L1", "K2L2", "K2L3", "K3L1", "K3L2", "K3L3", "K4L1", "K4L2", "K4L3", "K5L1", "K5L2", "K5L3"), K1 = c(0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L, 0L), K2 = c(0L, 0L,

How to transform a dataframe of characters to the respective dates?

Question: I have noticed a couple of times that working with dates doesn't allow for the usual tricks in R. Say I have a data frame Data with dates (see below), and I want to convert the complete data frame to the Date class. The only solution I could come up with so far is:

for (i in 1:ncol(Data)){
  Data[,i] <- as.Date(Data[,i],format="%d %B %Y")
}

This gives a data frame with the correct structure:

> str(Data)
'data.frame': 6 obs. of 4 variables:
 $ Rep1:Class 'Date' num [1:6] 12898 12898
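
A loop-free variant of the conversion above might look like the following sketch. The sample values in Data are hypothetical (the original data is not shown in the excerpt), and the %B month names are parsed according to the current locale, so the example assumes an English locale.

```r
# Hypothetical stand-in for the question's Data; the real values are not shown.
Data <- data.frame(
  Rep1 = c("25 April 2005", "26 April 2005"),
  Rep2 = c("2 May 2005", "3 May 2005"),
  stringsAsFactors = FALSE
)

# lapply() converts every column; assigning into Data[] keeps the
# data.frame structure instead of replacing it with a plain list.
Data[] <- lapply(Data, as.Date, format = "%d %B %Y")

str(Data)
```

Using lapply rather than apply or sapply matters here: those functions coerce their result to a matrix or atomic vector, which drops the Date class and leaves only the underlying numbers.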

How to mutate new columns with gradually increasing sequential combinations of data?

Question: Sample of df:

df <- tibble(name = LETTERS[1:10], x = rnorm(10, mean = 10), y = rnorm(10, 10), z = rnorm(10, 10))

I would like to mutate ranked columns for x, then for the sum of columns x and y, then for x, y and z, where the biggest numbers are ranked 1 and the smallest 10. Starting with x, I could do something like:

df %<>% mutate(rank_01 = min_rank(-x))

This computes the ranked column for x, but I'm not sure what the best process would be to compute the latter columns. I'm
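
For the three columns in the sample, the most direct option is probably to spell out each running sum inside a single mutate call. This is only a sketch: it assumes dplyr is installed, the column names rank_02 and rank_03 are made up to follow the rank_01 pattern, and the seed is added purely for reproducibility.

```r
library(dplyr)

set.seed(1)  # not in the question; added so the sketch is reproducible
df <- tibble(name = LETTERS[1:10],
             x = rnorm(10, mean = 10),
             y = rnorm(10, 10),
             z = rnorm(10, 10))

# Negating the value makes min_rank() give rank 1 to the largest number.
df <- df %>%
  mutate(rank_01 = min_rank(-x),
         rank_02 = min_rank(-(x + y)),
         rank_03 = min_rank(-(x + y + z)))

df
```

With many more columns, writing each sum by hand stops being practical, and a cumulative approach would be worth considering instead.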