R: Reorder columns from dcast output numerically instead of lexicographically

跟風遠走 submitted on 2019-12-07 11:30:11

Question


This is about ordering column names that contain both numbers and text. I have a data frame that resulted from dcast and has 200 rows. I have a problem with the ordering of its columns.

The column names are in the following format:

names(DF) <- c('Testname1.1', 'Testname1.100', 'Testname1.11', 'Testname1.2', ..., 'Testname2.99')

Edit: I would like to have the columns ordered as:

names(DF) <- c('Testname1.1', 'Testname1.2', 'Testname1.3', ..., 'Testname1.100', 'Testname2.1', ..., 'Testname2.100')

The original input has a column which specifies the day, but it is not being used when I 'cast' the data. Is there a way to specify the 'dcast' function to order combined column names numerically?

What would be the easiest way to get the columns ordered as I need to in R?

Thanks a lot!


Answer 1:


I think you need to split the column before you can use it to order the data frame:

library("reshape2")  ## for colsplit()
library("gtools")

Construct test data:

dat <- data.frame(matrix(1:25,5))
names(dat) <- c('Testname1.1', 'Testname1.100',
     'Testname1.11','Testname1.2','Testname2.99')

Split and order:

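cdat <- colsplit(names(dat),"\\.",c("name","num"))
## mixedorder() returns a permutation, not ranks, so passing it straight to
## order() would make the second key (num) a no-op; rank the distinct prefixes first
prefixes <- unique(cdat$name[mixedorder(cdat$name)])
dat[,order(match(cdat$name,prefixes),cdat$num)]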

##   Testname1.1 Testname1.2 Testname1.11 Testname1.100 Testname2.99
## 1           1          16           11             6           21
## 2           2          17           12             7           22
## 3           3          18           13             8           23
## 4           4          19           14             9           24
## 5           5          20           15            10           25

The mixedorder() call above (borrowed from @BondedDust's answer) is not really necessary for this example, but it would be needed if the number in the first (Testnamexx) component went beyond 9, so that Testname1, Testname2, and Testname10 come out in the proper order.
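For comparison, the same two-key ordering can be sketched in base R alone by extracting both numbers as integers. This is a minimal sketch rather than part of the original answer: the `Testname10.3` entry and the variable names are made up for illustration, and it assumes every name has the form text, digits, period, digits.

```r
# Hypothetical column names, including a two-digit prefix number
cols <- c("Testname1.1", "Testname1.100", "Testname1.11",
          "Testname1.2", "Testname2.99", "Testname10.3")

# Number embedded in the part before the period (e.g. 10 in "Testname10.3")
prefix_num <- as.integer(sub("^[^0-9]*([0-9]+)\\..*$", "\\1", cols))

# Number after the period (e.g. 3 in "Testname10.3")
suffix_num <- as.integer(sub("^.*\\.", "", cols))

# Order by prefix number first, then by suffix number
sorted_cols <- cols[order(prefix_num, suffix_num)]
print(sorted_cols)
```

To reorder a data frame the same way, the index from `order(prefix_num, suffix_num)` could be used as the column selector, as in `DF[, order(prefix_num, suffix_num)]`.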




Answer 2:


The mixedorder and mixedsort functions of pkg:gtools sometimes do what is desired, but in this case I think the period separator is messing things up because it is treated as part of a numeric value, when it was clearly intended to be a separator rather than a decimal point. Try:

nvec <- c('Testname1.1', 'Testname1.100', 'Testname1.11', 'Testname1.2', 'Testname2.99')
library(gtools)

parts  <- strsplit(nvec, "\\.")
prefix <- sapply(parts, "[[", 1)               # text before the period
suffix <- as.numeric(sapply(parts, "[[", 2))   # number after the period
## mixedorder() returns a permutation, so convert it to one rank per
## distinct prefix before using it as the first key in order()
prefix_rank <- match(prefix, unique(prefix[mixedorder(prefix)]))
myvec <- nvec[order(prefix_rank, suffix)]



Answer 3:


One way would be:

library(gtools) #use gtools library
library(NCmisc) #use NCmisc library for pad.left()

myvec <- c('Testname1.1', 'Testname1.100','Testname1.11','Testname1.2','Testname2.99') #construct your vector

myvec[mixedorder(  paste(substring(myvec,1,9), pad.left(substring(myvec,11,100),'0') , sep='')  ) ] 

[1] "Testname1.1"   "Testname1.2"   "Testname1.11"  "Testname1.100" "Testname2.99"
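The zero-padding idea can also be sketched in base R, without NCmisc and without hard-coded character positions. This is only an illustration under stated assumptions: the variable names are made up, each name is assumed to contain exactly one period, and only the part after the period is padded, so prefixes such as Testname10 would still need separate handling.

```r
myvec <- c("Testname1.1", "Testname1.100", "Testname1.11",
           "Testname1.2", "Testname2.99")

# Split each name at the literal period
parts  <- strsplit(myvec, ".", fixed = TRUE)
prefix <- vapply(parts, `[`, character(1), 1)
suffix <- as.integer(vapply(parts, `[`, character(1), 2))

# Left-pad the numeric suffix with zeros to a common width
width  <- max(nchar(suffix))
padded <- paste0(prefix, ".", formatC(suffix, width = width, flag = "0"))

# With equal-width suffixes, plain string ordering matches numeric ordering;
# method = "radix" sorts in the C locale, independent of the session locale
sorted_vec <- myvec[order(padded, method = "radix")]
print(sorted_vec)
```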


Source: https://stackoverflow.com/questions/27412741/r-reorder-columns-from-dcast-output-numerically-instead-of-lexicographically
