R: Sort columns of a data frame by a vector of column names

失恋的感觉 2020-12-16 22:27

I have a data.frame that looks like this: [image]

which has 1000+ columns with similar

3 Answers
  • 2020-12-16 22:39

    Brodie's answer does exactly what you're asking for. However, you imply that your data are large, so I will provide an alternative using "data.table", which has a function called setcolorder that changes the column order by reference, that is, without copying the data.

    Here's a reproducible example.

    Start with some simple data:

    mydf <- data.frame(A = 1:2, B = 3:4, C = 5:6)
    matches <- data.frame(X = 1:3, Y = c("C", "A", "B"), Z = 4:6)
    mydf
    #   A B C
    # 1 1 3 5
    # 2 2 4 6
    matches
    #   X Y Z
    # 1 1 C 4
    # 2 2 A 5
    # 3 3 B 6
    

    Provide proof that Brodie's answer works:

    out <- mydf[matches$Y]
    out
    #   C A B
    # 1 5 1 3
    # 2 6 2 4
    

    Now a more memory-efficient way to do the same thing:

    library(data.table)
    setDT(mydf)
    mydf
    #    A B C
    # 1: 1 3 5
    # 2: 2 4 6
    
    setcolorder(mydf, as.character(matches$Y))
    mydf
    #    C A B
    # 1: 5 1 3
    # 2: 6 2 4
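
    A note on the as.character() call above: it matters when the column holding the names is a factor, which was the default for character columns in data.frame() before R 4.0.0. Indexing with a factor uses its integer codes, not its labels, so it only gives the right order by coincidence when the columns already happen to be sorted alphabetically. Here is a small base R sketch with made-up data (the objects ord, by_codes and by_names are just for illustration) where the two differ:

    ```r
    # Columns deliberately NOT in alphabetical order
    mydf <- data.frame(B = 3:4, C = 5:6, A = 1:2)

    # A factor of column names: levels are "A" "B" "C", so the codes are 3 1 2
    ord <- factor(c("C", "A", "B"))

    # Indexing by the factor uses the integer codes (columns 3, 1, 2)
    by_codes <- colnames(mydf[ord])

    # Converting to character first indexes by name, which is what we want
    by_names <- colnames(mydf[as.character(ord)])

    by_codes
    # [1] "A" "B" "C"
    by_names
    # [1] "C" "A" "B"
    ```

    So if the order vector comes out of a data frame column, wrapping it in as.character() is a cheap safeguard.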
    
  • 2020-12-16 22:40

    A5C1D2H2I1M1N2O1R2T1's solution didn't work for my data (I have a problem similar to Yilun Zhang's), so I found another option:

    mydf <- data.frame(A = 1:2, B = 3:4, C = 5:6)
    #   A B C
    # 1 1 3 5
    # 2 2 4 6
    matches <- c("B", "C", "A") #desired order
    
    mydf_reorder <- mydf[,match(matches, colnames(mydf))]
    colnames(mydf_reorder)
    #[1] "B" "C" "A"
    

    match() returns the position of each element of its first argument within its second:

    match(matches, colnames(mydf))
    #[1] 2 3 1
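
    One thing to watch for with match(): a name that is not a column of the data frame comes back as NA, and indexing with that NA fails. Below is a rough sketch of a more defensive variant, assuming you want to skip unknown names and keep any unlisted columns at the end (that policy, and the names wanted, idx, rest and reordered, are my own choices for illustration):

    ```r
    mydf <- data.frame(A = 1:2, B = 3:4, C = 5:6)
    wanted <- c("C", "Z", "A")               # "Z" is not a column of mydf

    idx <- match(wanted, colnames(mydf))     # 3 NA 1
    idx <- idx[!is.na(idx)]                  # drop names that were not found
    rest <- setdiff(seq_along(mydf), idx)    # columns not mentioned in 'wanted'

    # Requested columns first, remaining columns after them
    reordered <- mydf[, c(idx, rest), drop = FALSE]
    colnames(reordered)
    # [1] "C" "A" "B"
    ```

    The drop = FALSE is there so the result stays a data frame even if only one column is left.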
    

    I hope this can offer another solution if anyone is having problems!

  • 2020-12-16 23:02

    UPDATE, with reproducible data added by OP:

    df <- read.table(header = TRUE, text = "A    B    C
        1    2    3
        4    5    6")
    vec <- c("B", "C", "A")
    df[vec]
    

    Results in:

      B C A
    1 2 3 1
    2 5 6 4
    

    As OP desires.


    How about:

    df[df.clust$mutation_id]
    

    Where df is the data.frame you want to sort the columns of and df.clust is the data frame that contains the vector with the column order (mutation_id).

    This basically treats df as a list and uses standard vector indexing techniques to re-order it.
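
    To make the "treats df as a list" point a bit more concrete, here is a small sketch with made-up data (not the OP's). With several column names, df[vec] and df[, vec] select the same columns in the same order; the difference shows up with a single name, where the list-style form always returns a data frame while the matrix-style form drops to a plain vector by default:

    ```r
    df  <- data.frame(A = c(1, 4), B = c(2, 5), C = c(3, 6))
    vec <- c("B", "C", "A")

    list_style   <- df[vec]     # one index: df is treated as a list of columns
    matrix_style <- df[, vec]   # two indices: rows, then columns

    colnames(list_style)
    # [1] "B" "C" "A"
    colnames(matrix_style)
    # [1] "B" "C" "A"

    one_list   <- df["B"]       # still a data frame with one column
    one_matrix <- df[, "B"]     # a numeric vector, because drop = TRUE by default

    is.data.frame(one_list)
    # [1] TRUE
    is.data.frame(one_matrix)
    # [1] FALSE
    ```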
