Manipulate a data frame where there are multiple columns for each experiment

情深已故 2020-12-21 13:25

I have many sequencing experiments, each with multiple results for each of a few hundred genes. When the data is output from another programme, it isn't in a useful format.

1 answer
  • 2020-12-21 14:17

    This might be a shorter way to do it:

    pp.new <- as.data.frame(t(pp)[-1,], row.names = 1)
    names(pp.new) <- c("experiment", "part", "gene1", "gene2", "gene3", "gene4")
    

    which gives:

    > pp.new
       experiment   part gene1 gene2 gene3 gene4
    1 Experiment1 Part 1     a     b     c     d
    2 Experiment1 Part 2     e     f     g     h
    3 Experiment2 Part 1     i     j     k     l
    4 Experiment2 Part 2     m     n     o     p
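
The question's `pp` is not reproduced here, so the following is only a sketch on a hypothetical `pp` built to match the output above (the column names `id` and `V1` to `V4` are assumptions). It shows each step of the one-liner separately:

```r
# Hypothetical input: the first column holds row labels, each remaining
# column is one experiment/part combination (values taken from the output above).
pp <- data.frame(
  id = c("experiment", "part", "gene1", "gene2", "gene3", "gene4"),
  V1 = c("Experiment1", "Part 1", "a", "b", "c", "d"),
  V2 = c("Experiment1", "Part 2", "e", "f", "g", "h"),
  V3 = c("Experiment2", "Part 1", "i", "j", "k", "l"),
  V4 = c("Experiment2", "Part 2", "m", "n", "o", "p"),
  stringsAsFactors = FALSE
)

# t() converts the data frame to a character matrix and transposes it;
# [-1, ] then drops the first row, which holds the labels from `id`.
m <- t(pp)[-1, ]

# row.names = 1 does not match the number of rows, so the result appears to
# fall back to automatic row names (1..4) rather than "V1".."V4".
pp.new <- as.data.frame(m, row.names = 1, stringsAsFactors = FALSE)

# Reuse the label column as column names instead of typing them out.
names(pp.new) <- pp$id
print(pp.new)
```

Taking the names from `pp$id` avoids hard-coding a few hundred gene names, provided the label column really is the first column of your data.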
    

    However, it is probably better to transform this into long format with the reshape2 package:

    library(reshape2)    
    pp.long <- melt(pp.new, id=c("experiment","part"))
    

    which results in:

    > pp.long
        experiment   part variable value
    1  Experiment1 Part 1    gene1     a
    2  Experiment1 Part 2    gene1     e
    3  Experiment2 Part 1    gene1     i
    4  Experiment2 Part 2    gene1     m
    5  Experiment1 Part 1    gene2     b
    6  Experiment1 Part 2    gene2     f
    7  Experiment2 Part 1    gene2     j
    8  Experiment2 Part 2    gene2     n
    9  Experiment1 Part 1    gene3     c
    10 Experiment1 Part 2    gene3     g
    11 Experiment2 Part 1    gene3     k
    12 Experiment2 Part 2    gene3     o
    13 Experiment1 Part 1    gene4     d
    14 Experiment1 Part 2    gene4     h
    15 Experiment2 Part 1    gene4     l
    16 Experiment2 Part 2    gene4     p
    

    If you want output comparable to x3, you can use the recast function (also from the reshape2 package):

    recast(pp.new, part + variable ~ experiment, id.var=c("experiment","part"), value.var = "value")
    

    which gives:

        part variable Experiment1 Experiment2
    1 Part 1    gene1           a           i
    2 Part 1    gene2           b           j
    3 Part 1    gene3           c           k
    4 Part 1    gene4           d           l
    5 Part 2    gene1           e           m
    6 Part 2    gene2           f           n
    7 Part 2    gene3           g           o
    8 Part 2    gene4           h           p
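
reshape2 is now retired in favour of tidyr, so the same two reshaping steps can also be sketched with `pivot_longer()` and `pivot_wider()`. This is an alternative, not part of the original answer, and `pp.new` is rebuilt by hand here only so the example runs on its own:

```r
library(tidyr)

# pp.new as shown earlier in the answer, rebuilt by hand for a standalone example.
pp.new <- data.frame(
  experiment = c("Experiment1", "Experiment1", "Experiment2", "Experiment2"),
  part       = c("Part 1", "Part 2", "Part 1", "Part 2"),
  gene1      = c("a", "e", "i", "m"),
  gene2      = c("b", "f", "j", "n"),
  gene3      = c("c", "g", "k", "o"),
  gene4      = c("d", "h", "l", "p"),
  stringsAsFactors = FALSE
)

# Long format: one row per experiment/part/gene (the counterpart of melt()).
pp.long <- pivot_longer(pp.new, cols = starts_with("gene"),
                        names_to = "variable", values_to = "value")

# Wide by experiment: one column per experiment (the counterpart of recast()).
pp.wide <- pivot_wider(pp.long, names_from = experiment, values_from = value)
print(pp.wide)
```

Note that `pivot_longer()` orders the long table by input row rather than by gene, so `pp.long` lists rows in a different order than the `melt()` output, although the contents are the same.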
    