Question
I'm stuck and would appreciate help from experienced R users in finding a quick way to loop my code. I have a data frame like this:
GENES <- c('RCD-7','ADF-1','BBF-10','BBF-10','BBF-10','CCF-103')
pos_1 <- c('T','G','T','A','C','T')
pos_2 <- c('G','T','A','A','C','G')
df <- data.frame(GENES,pos_1,pos_2)
print(df)
GENES pos_1 pos_2
RCD-7 T G
ADF-1 G T
BBF-10 T A
BBF-10 A A
BBF-10 C C
CCF-103 T G
What I want to do with df is calculate the percentage of each nucleotide (think of them as letters) at each position (the columns), and then get the maximum percentage at each position for each gene in the first column. I get my desired output by writing separate lines of code for each position. However, my real df has more than 200 rows and columns, so I want to avoid pasting the same code again and again for every position.
Here are the commands (shown for just two positions) that I used for the calculations and to get the desired output.
counts1 <- table(df$GENES, df$pos_1)
counts2 <- table(df$GENES, df$pos_2)
#
counts_df1 <- as.data.frame(unclass(counts1))
counts_df2 <- as.data.frame(unclass(counts2))
#
ordered_df1 <- tibble::rownames_to_column(counts_df1, "GENES")
ordered_df2 <- tibble::rownames_to_column(counts_df2, "GENES")
# table() sorts the nucleotide columns alphabetically: A, C, G, T
colnames(ordered_df1) <- c("GENES", "A1", "C1", "G1", "T1")
colnames(ordered_df2) <- c("GENES", "A2", "C2", "G2", "T2")
# convert all four count columns (2:5) to numeric
ordered_df1[, 2:5] <- sapply(ordered_df1[, 2:5], as.numeric)
ordered_df2[, 2:5] <- sapply(ordered_df2[, 2:5], as.numeric)
#
final_df1 <- cbind(ordered_df1[1], prop.table(as.matrix(ordered_df1[-1]), margin = 1)*100)
final_df2 <- cbind(ordered_df2[1], prop.table(as.matrix(ordered_df2[-1]), margin = 1)*100)
# %>% and mutate() come from dplyr
library(dplyr)
row_max_df1 <- final_df1 %>% mutate(pos_1_max = pmax(A1, C1, G1, T1))
row_max_df2 <- final_df2 %>% mutate(pos_2_max = pmax(A2, C2, G2, T2))
# name the second max column explicitly so it appears as pos_2_max
col_combined1 <- cbind(row_max_df1[, c(1, 6)], pos_2_max = row_max_df2[, 6])
The desired output should be:
GENES pos_1_max pos_2_max
ADF-1 100.00000 100.00000
BBF-10 33.33333 66.66667
CCF-103 100.00000 100.00000
RCD-7 100.00000 100.00000
I couldn't even start writing a loop for the first two lines of my code, so I would really appreciate any help.
Source: https://stackoverflow.com/questions/62130265/how-to-make-several-for-loops-to-perform-different-functions-using-r
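For reference, the repeated per-position steps above could probably be collapsed into one small function applied to every position column. The following is only a minimal base R sketch, not a definitive solution: the helper name max_pct and the variables pos_cols, max_list, and result are hypothetical, and it assumes every column other than GENES is a position column.

```r
# Example data from the question
GENES <- c('RCD-7','ADF-1','BBF-10','BBF-10','BBF-10','CCF-103')
pos_1 <- c('T','G','T','A','C','T')
pos_2 <- c('G','T','A','A','C','G')
df <- data.frame(GENES, pos_1, pos_2)

# Hypothetical helper: for one position column, return the largest
# within-gene percentage held by any single nucleotide.
max_pct <- function(pos, genes) {
  counts <- table(genes, pos)                   # genes x nucleotides counts
  pct <- prop.table(counts, margin = 1) * 100   # row-wise percentages
  apply(pct, 1, max)                            # maximum per gene (named vector)
}

# Assumption: every column except GENES is a position column
pos_cols <- setdiff(names(df), "GENES")

# Apply the helper to each position column; this replaces the copy-pasted lines
max_list <- lapply(df[pos_cols], max_pct, genes = df$GENES)
names(max_list) <- paste0(pos_cols, "_max")

# Assemble one row per gene, one *_max column per position
result <- data.frame(GENES = names(max_list[[1]]), max_list, row.names = NULL)
print(result)
```

Because lapply() walks over whatever columns are in pos_cols, the same few lines should scale to 200 or more positions without extra code. On the example data, result has the columns GENES, pos_1_max, and pos_2_max, with genes in the alphabetical order that table() produces.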