R sort() data.frame

Posted by 冷暖自知 on 2020-01-15 20:59:18

Question


I have the following data frame

head(stockdatareturnpercent)
                  SPY         DIA        IWM        SMH        OIH        
2001-04-02  8.1985485   7.8349806   7.935566  21.223832  13.975655  
2001-05-01 -0.5621328   1.7198760   2.141846 -10.904936  -4.565291  
2001-06-01 -2.6957979  -3.5838102   2.786250   4.671762 -23.241009 
2001-07-02 -1.0248091  -0.1997433  -5.725078  -3.354391  -9.161594  
2001-08-01 -6.1165559  -5.0276558  -2.461728  -6.218129 -13.956695  
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913 

There are actually more stocks, but I cut the table down for illustration. For each month I want to rank the performers from best to worst (or worst to best). I played around with the sort() function, and this is what I came up with.

N <- dim(stockdatareturnpercent)[1]  
for (i in 1:N) {  
    s <- sort(stockdatareturnpercent[i,])  
    print(s)  
}  

                 UPS     FDX      XLP      XLU      XLV     DIA      IWM      SPY      XLE      XLB      XLI      OIH      XLK      SMH     MSFT
2001-04-02 0.6481585 0.93135 1.923136 4.712996 7.122751 7.83498 7.935566 8.198549 9.826701 10.13465 10.82522 13.97566 14.98789 21.22383 21.41436
                 SMH       FDX       OIH       XLK        XLE        SPY       XLU      XLP      DIA     MSFT      IWM     UPS      XLV      XLB      XLI
2001-05-01 -10.90494 -5.045544 -4.565291 -4.182041 -0.9492803 -0.5621328 0.6987724 1.457579 1.719876 2.088734 2.141846 3.73587 3.748309 3.774033 4.099748
                 OIH       XLE       XLI     XLU     XLP       XLB      DIA       UPS       SPY       XLV       FDX      XLK     IWM      SMH     MSFT
2001-06-01 -23.24101 -10.02403 -6.594324 -5.8602 -5.0532 -3.955192 -3.58381 -2.814685 -2.695798 -1.177474 0.4987542 1.935544 2.78625 4.671762 5.374764
                MSFT       OIH      XLK       IWM       SMH       XLV       UPS       XLE       SPY        XLU        XLB        XLI        DIA      FDX
2001-07-02 -9.793005 -9.161594 -7.17351 -5.725078 -3.354391 -2.016818 -1.692442 -1.159914 -1.024809 -0.9029407 -0.2723560 -0.2078283 -0.1997433 2.868898
                XLP
2001-07-02 2.998604

This is a very inefficient and crude way to see the results. It would be nice to create an object that stores this data. However, if I type 's' at the R prompt I only get the value for the last row, because each iteration of the for loop overwrites the previous result.

I would greatly appreciate some guidance. Thank you kindly.


Answer 1:


Use order() for this, as sort() drops the names when used with the *apply functions:

id <- t(apply(Data,1,order))
lapply(1:nrow(id),function(i)Data[i,id[i,]])

Storing the results of order() in an id matrix also lets you do, e.g.:

matrix(names(Data)[id],ncol=ncol(Data))
     [,1]  [,2]  [,3]  [,4]  [,5] 
[1,] "DIA" "IWM" "SPY" "OIH" "SMH"
[2,] "SMH" "OIH" "SPY" "DIA" "IWM"
[3,] "OIH" "DIA" "SPY" "IWM" "SMH"
[4,] "OIH" "IWM" "SMH" "SPY" "DIA"
[5,] "OIH" "SMH" "SPY" "DIA" "IWM"
[6,] "SMH" "OIH" "IWM" "DIA" "SPY"

This lets you find out which ones were the best at a given moment.
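As a rough sketch of how the same order() idea can answer the "best performer per month" question directly, the example below ranks each row from best to worst by passing decreasing = TRUE. The names returns, rank_id, ranked, best and worst are made up for this illustration, and the data is a three-ticker, three-month subset of the question's table:

```r
# Subset of the returns from the question (three months, three tickers).
returns <- data.frame(
  SPY = c(8.1985485, -0.5621328, -2.6957979),
  DIA = c(7.8349806, 1.7198760, -3.5838102),
  IWM = c(7.935566, 2.141846, 2.786250),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01")
)

# Column indices of each row, ordered from highest to lowest return.
rank_id <- t(apply(returns, 1, order, decreasing = TRUE))

# Translate the indices into ticker names; one row per month.
ranked <- matrix(names(returns)[rank_id], ncol = ncol(returns),
                 dimnames = list(rownames(returns), NULL))

best  <- ranked[, 1]             # top performer of each month
worst <- ranked[, ncol(ranked)]  # bottom performer of each month
```

Because the first column of ranked holds the top performer, best and worst come out as character vectors named by date.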

If you want to keep your loop, you could use a list. As Joshua said, you overwrite s in every iteration, so initialize a list to store the results first. This loop gives the same results as the lapply() code above, but without the id matrix. There's no gain in speed, although using apply has other benefits:

N <- nrow(Data)
s <- vector("list",N)
for (i in 1:N) {
    s[[i]] <- sort(Data[i,])
}

I tested the code using the following sample data (please provide your own in the future, using either this approach or, e.g., dput()):

zz <- textConnection(" SPY         DIA        IWM        SMH        OIH
  8.1985485   7.8349806   7.935566  21.223832  13.975655
 -0.5621328   1.7198760   2.141846 -10.904936  -4.565291
 -2.6957979  -3.5838102   2.786250   4.671762 -23.241009
 -1.0248091  -0.1997433  -5.725078  -3.354391  -9.161594
 -6.1165559  -5.0276558  -2.461728  -6.218129 -13.956695
 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913 ")

Data <- read.table(zz, header = TRUE)
close(zz)
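For readers who have not used dput(), here is a minimal sketch of what that suggestion amounts to. The returns data frame is a small stand-in built from two rows of the question's data, and the names code and rebuilt are only illustrative:

```r
# A small data frame standing in for the real data.
returns <- data.frame(
  SPY = c(8.1985485, -0.5621328),
  DIA = c(7.8349806, 1.7198760),
  row.names = c("2001-04-02", "2001-05-01")
)

# dput() prints R code that recreates the object; that text can be
# pasted straight into a question.
dput(returns)

# Round trip: capture the printed code and evaluate it to rebuild the object.
code <- paste(capture.output(dput(returns)), collapse = "\n")
rebuilt <- eval(parse(text = code))
```

Anyone who pastes the dput() output into their own session gets the same object, including row names and column types, without having to parse printed console output.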



Answer 2:


Using your original code to save each sorted row in a list:

stockdatareturnpercent <- read.table(textConnection("                  SPY         DIA        IWM        SMH        OIH        
2001-04-02  8.1985485   7.8349806   7.935566  21.223832  13.975655  
2001-05-01 -0.5621328   1.7198760   2.141846 -10.904936  -4.565291  
2001-06-01 -2.6957979  -3.5838102   2.786250   4.671762 -23.241009 
2001-07-02 -1.0248091  -0.1997433  -5.725078  -3.354391  -9.161594  
2001-08-01 -6.1165559  -5.0276558  -2.461728  -6.218129 -13.956695  
2001-09-04 -8.8900629 -12.2663267 -15.760037 -39.321172 -16.902913"))

x <- vector("list", nrow(stockdatareturnpercent))

## use unlist to drop the data.frame structure
for (i in 1:nrow(stockdatareturnpercent)) {
    x[[i]] <- sort(unlist(stockdatareturnpercent[i, ]))
}
## use the row names to name each list element
names(x) <- rownames(stockdatareturnpercent)

x
$`2001-04-02`
  DIA       IWM       SPY       OIH       SMH 
7.834981  7.935566  8.198548 13.975655 21.223832 

$`2001-05-01`
    SMH         OIH         SPY         DIA         IWM 
-10.9049360  -4.5652910  -0.5621328   1.7198760   2.1418460 

$`2001-06-01`
   OIH        DIA        SPY        IWM        SMH 
-23.241009  -3.583810  -2.695798   2.786250   4.671762 

$`2001-07-02`
   OIH        IWM        SMH        SPY        DIA 
-9.1615940 -5.7250780 -3.3543910 -1.0248091 -0.1997433 

$`2001-08-01`
   OIH        SMH        SPY        DIA        IWM 
-13.956695  -6.218129  -6.116556  -5.027656  -2.461728 

$`2001-09-04`
   SMH        OIH        IWM        DIA        SPY 
-39.321172 -16.902913 -15.760037 -12.266327  -8.890063 

You can also use apply() directly to sort each row, although this does not preserve the element names:

apply(stockdatareturnpercent, 1, sort)

That returns a matrix in which each column is one sorted row, so transpose it:

sortmat <- t(apply(stockdatareturnpercent, 1, sort))

If you need the result as a data.frame, convert it with as.data.frame():

sortdf <- as.data.frame(sortmat)

Finally, all of that in one line:

sortdf <- as.data.frame(t(apply(stockdatareturnpercent, 1, sort)))
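Since the apply() route loses the ticker names, one possible alternative (a sketch, not part of the original answer) is to build a "long" data frame that keeps date, ticker, return and rank together. The names returns and ranked_long and the column names are assumptions made for this example; the data is a subset of the question's table:

```r
# Subset of the returns from the question (three months, three tickers).
returns <- data.frame(
  SPY = c(8.1985485, -0.5621328, -2.6957979),
  DIA = c(7.8349806, 1.7198760, -3.5838102),
  IWM = c(7.935566, 2.141846, 2.786250),
  row.names = c("2001-04-02", "2001-05-01", "2001-06-01")
)

# For each month, sort the row (worst to best) and keep the names next to
# the values, then stack the monthly pieces into one data frame.
ranked_long <- do.call(rbind, lapply(rownames(returns), function(d) {
  r <- sort(unlist(returns[d, ]))
  data.frame(date = d, ticker = names(r), ret = unname(r),
             rank = seq_along(r), stringsAsFactors = FALSE)
}))
```

With this layout, something like ranked_long[ranked_long$rank == 1, ] would pick out the worst performer of each month, since rank 1 is the lowest return here.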


Source: https://stackoverflow.com/questions/5602525/r-sort-data-frame
