R function for normalization based on one column?

Question

Is it possible to normalize this table in R based on the last column (samples), where samples is the number of sequenced genomes? I want to get a normalized distribution of all the genes across all the conditions.

Here is a simplified example of my data, followed by what I tried:

dat1 <- read.table(text = " gene1   gene2   gene3   samples 
condition1  1   1   8   120
condition2  18  4   1   118
condition3  0   0   1   75
condition4  32  1   1   130", header = TRUE)

dat1 <- normalize(dat1, method = "standardize", range = c(0, 1), margin = 1L, on.constant = "quiet")

But the results include negative values, and I am not sure how useful this approach is. Can anyone please suggest how I should normalize my data to get meaningful results?

Thanks a lot and apologies if it is a dumb question.

Answer 1:

Using your data, first write a min-max function:

minmax = function(x){ (x-min(x))/(max(x)-min(x))}

Then iterate through the gene columns, dividing each by samples before rescaling:

norm = data.frame(lapply(dat1[,1:3],function(i) minmax(i/dat1$samples)))

The result looks like this; I hope it's correct:

       gene1     gene2      gene3
1 0.03385417 0.2458333 1.00000000
2 0.61970339 1.0000000 0.01326455
3 0.00000000 0.0000000 0.09565217
4 1.00000000 0.2269231 0.00000000
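
As a variation on the approach above, here is a minimal sketch that keeps the condition names as row names and stores the per-genome rates separately from the rescaled values. The names gene_cols, rates and scaled are made up for illustration, and whether min-max rescaling is the right final step for this data is an assumption, not something the question settles.

```r
# Same toy data as in the question; the first column becomes the row names
dat1 <- read.table(text = "gene1 gene2 gene3 samples
condition1 1 1 8 120
condition2 18 4 1 118
condition3 0 0 1 75
condition4 32 1 1 130", header = TRUE)

# Every column except the sample count is treated as a gene column
gene_cols <- setdiff(names(dat1), "samples")

# Per-genome rate: each count divided by the sample count of its own row
rates <- dat1[gene_cols] / dat1$samples

# Optional min-max rescaling of each gene column to [0, 1]
# (returns NaN for a column whose values are all identical)
minmax <- function(x) (x - min(x)) / (max(x) - min(x))
scaled <- rates
scaled[] <- lapply(rates, minmax)

print(rates)
print(scaled)
```

If the goal is only to correct for the different number of genomes per condition, rates may already be enough; scaled additionally forces every gene onto the same 0 to 1 range, which discards the difference in overall abundance between genes.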

Source: https://stackoverflow.com/questions/64873299/r-function-for-normalization-based-on-one-column

Tags

normalization

standardized
