R: enumerate column combinations of a matrix

Posted by 夙愿已清 on 2020-01-14 14:07:22

Question


(edit note: I changed the Title to "R: enumerate column combinations of a matrix", from "R grep: matching a matrix of strings to a list" to better reflect the solution)

I am trying to match a matrix of strings to a list, so that I can ultimately use the matrix as a map in later operations on a data.frame.

This first part works as intended, returning a list of all the possible pairs, triples and quad combinations (though perhaps this approach has created my bind?):

priceList <- data.frame(aaa = rnorm(100, 100, 10), bbb = rnorm(100, 100, 10), 
            ccc = rnorm(100, 100, 10), ddd = rnorm(100, 100, 10), 
            eee = rnorm(100, 100, 10), fff = rnorm(100, 100, 10), 
            ggg = rnorm(100, 100, 10))

getTrades <- function(dd, Maxleg=3)
{
    nodes <- colnames(dd)
    tradeList <- list()
    for (i in 2:Maxleg){
        tradeLeg <- paste0('legs',i)
        tradeList[[tradeLeg]] <- combn(nodes, i)
    }
    return(tradeList)
}

tradeCombos <- getTrades(priceList, 4)

I'd now like to turn this list of possible combinations into trades. For example:

> tradeCombos[[1]][,1]
[1] "aaa" "bbb"

Needs to eventually become priceList[,2] - priceList[,1], and so forth.

I have tried a few approaches with grep and similar commands, and feel that I've come close with the following:

LocList <- sapply(tradeCombos[[1]], regexpr, colnames(priceList))

However the format is not quite suitable for the next step.

Ideally, LocList[1] would return something like: 1 2

Assuming that the tradeCombos[[1]][,1] == "aaa" "bbb".

Can someone please help?

__

With help from all of the answers below, I've now got:

colDiff <- function(x) 
{
    Reduce('-', rev(x))
}

getTrades <- function(dd, Maxleg=3)
{
    tradeList <- list()
    for (i in 2:Maxleg){
        tradeLeg <- paste0('legs',i)
        tradeLegsList <- combn(names(dd), i, 
            function(x) dd[x], simplify = FALSE)
        nameMtx <- combn(names(dd), i)
        names(tradeLegsList) <- apply(nameMtx, MARGIN=2, 
            FUN=function(x) paste(rev(x), collapse='*'))
        tradeList[[tradeLeg]] <- lapply(tradeLegsList, colDiff) 
    }
    return(tradeList)
}

tradeCombos <- getTrades(priceList, 4)

This retains the names of the constituent parts, and is everything I was trying to achieve.

Many thanks to all for the help.


Answer 1:


This gets your eventual aim using lapply, apply, and Reduce.

lapply(tradeCombos, 
 function(combos) 
 apply(combos, MARGIN=2, FUN=function(combo) Reduce('-', priceList[rev(combo)])))

combo is a column from one of the combo matrices in tradeCombos. rev(combo) reverses the column so the last value is first. The R syntax for selecting a subset of columns from a data.frame is DF[col.names], so priceList[rev(combo)] is a subset of priceList with just the columns in combo, in reverse order. data.frames are actually just lists of columns, so any function that's designed to iterate over lists can be used to iterate over the columns in a data.frame. Reduce is one such function. Reduce takes a function (in this case the subtract function -) and a list of arguments and then successively calls the function on the arguments in the list with the results of the previous call, e.g., (((arg1 - arg2) - arg3) - arg4).
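As a minimal sketch of the fold described above, the following uses a small toy data.frame (the name prices and its values are made up for illustration, not taken from the question) and checks the Reduce result against the same subtraction written out by hand:

```r
# Toy data: hypothetical values chosen so the arithmetic is easy to follow
prices <- data.frame(aaa = c(10, 20, 30),
                     bbb = c(11, 22, 33),
                     ccc = c(1, 2, 3))

combo <- c("aaa", "bbb", "ccc")

# rev(combo) puts the last leg first, so the fold computes (ccc - bbb) - aaa,
# column by column, and returns a plain numeric vector
spread <- Reduce("-", prices[rev(combo)])

# The same calculation written out explicitly
manual <- (prices$ccc - prices$bbb) - prices$aaa

print(spread)
print(identical(spread, manual))
```

Because Reduce treats the data.frame as a list of columns, the same call should work unchanged for two, three, or four legs.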

You can rename the columns in tradeCombos so that the final column names reflect their source with:

tradeCombos <- lapply(tradeCombos, 
    function(combos) {
        dimnames(combos)[[2]] <- apply(combos, 
            MARGIN=2, 
            FUN=function(combo) paste(rev(combo), collapse='-')
        )
        return(combos)
    }
)



Answer 2:


Whoa... ignore everything below and jump to the update

As mentioned in my comment, you can just use combn. This solution doesn't take you to your very last step, but instead, creates a list of data.frames. From there, it is easy to use lapply to get to whatever your final step would be.

Here's the simplified function:

TradeCombos <- function(dd, MaxLeg) {
  combos = combn(names(dd), MaxLeg)
  apply(combos, 2, function(x) dd[x])
}

To use it, just specify your dataset and the number of combinations you're looking for.

TradeCombos(priceList, 3)
TradeCombos(priceList, 4)

Moving on: @mplourde has shown you how to use Reduce to successively subtract. A similar approach would be taken here:

cumDiff <- function(x) Reduce("-", rev(x))
lapply(TradeCombos(priceList, 3), cumDiff)

By keeping the output of the TradeCombos function as a list of data.frames, you'll be leaving more room for flexibility. For instance, if you wanted row sums, you can simply use lapply(TradeCombos(priceList, 3), rowSums); similar approaches can be taken for whatever function you want to apply.
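To illustrate that flexibility, here is a small sketch that reuses the TradeCombos function from this answer on a toy data.frame (prices and its values are assumptions for illustration) and applies two different row-wise summaries to the same list:

```r
# Toy data: hypothetical values, not from the question
prices <- data.frame(aaa = c(10, 20), bbb = c(11, 22), ccc = c(1, 2))

# Same function as defined earlier in this answer
TradeCombos <- function(dd, MaxLeg) {
  combos <- combn(names(dd), MaxLeg)
  apply(combos, 2, function(x) dd[x])
}

# A list with one 2-column data.frame per pair of columns
pairs <- TradeCombos(prices, 2)

# Any function that accepts a data.frame can be mapped over the list
pair_sums  <- lapply(pairs, rowSums)
pair_means <- lapply(pairs, rowMeans)

print(names(pairs[[2]]))
print(pair_sums[[1]])
```

The list elements come back unnamed here, so they are addressed by position; each data.frame still carries its own column names.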

Update

I'm not sure why @GSee didn't add this as an answer, but I think it's pretty awesome:

Get your list of data.frames as follows:

combn(names(priceList), 3, function(x) priceList[x], simplify = FALSE)

Advance as needed. (For example, using the cumDiff function we created: combn(names(priceList), 2, function(x) cumDiff(priceList[x]), simplify = FALSE).)
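A possible way to keep track of which result belongs to which combination is to name the list elements after their legs. The sketch below assumes a toy data.frame called prices and a "last-first" naming scheme joined with "-"; both are illustrative choices rather than anything prescribed by combn:

```r
# Toy data: hypothetical values, not from the question
prices <- data.frame(aaa = c(10, 20), bbb = c(11, 22), ccc = c(1, 2))

cumDiff <- function(x) Reduce("-", rev(x))

# One numeric vector of differences per pair of columns
diffs <- combn(names(prices), 2,
               function(x) cumDiff(prices[x]),
               simplify = FALSE)

# Build matching labels from the same combinations, e.g. "bbb-aaa"
labels <- apply(combn(names(prices), 2), 2,
                function(x) paste(rev(x), collapse = "-"))
names(diffs) <- labels

print(names(diffs))
print(diffs[["bbb-aaa"]])
```

Since both calls to combn enumerate the combinations in the same order, the labels line up with the results element by element.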




Answer 3:


tradeCombos is a list with matrix elements. Therefore, tradeCombos[[1]] is a matrix for which apply is more suitable.

apply(tradeCombos[[1]],1,function(x) match(x,names(priceList)))
      [,1] [,2]
 [1,]    1    2
 [2,]    1    3
 [3,]    1    4
 [4,]    1    5
 [5,]    1    6
 [6,]    1    7
 [7,]    2    3
 [8,]    2    4
 [9,]    2    5
[10,]    2    6
[11,]    2    7
[12,]    3    4
[13,]    3    5
[14,]    3    6
[15,]    3    7
[16,]    4    5
[17,]    4    6
[18,]    4    7
[19,]    5    6
[20,]    5    7
[21,]    6    7

Incidentally, you can subset using the string form anyway, e.g. priceList[,"aaa"]
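As a quick check that the two routes agree, this sketch looks up the positions of one pair with match and then computes the same difference by name and by position (prices and pair are hypothetical names with made-up values):

```r
# Toy data: hypothetical values, not from the question
prices <- data.frame(aaa = c(10, 20), bbb = c(11, 22), ccc = c(1, 2))

pair <- c("aaa", "bbb")

# Positions of the named columns within the data.frame
idx <- match(pair, names(prices))

# Second leg minus first leg, selected by name and by position
by_name <- prices[, pair[2]] - prices[, pair[1]]
by_pos  <- prices[, idx[2]] - prices[, idx[1]]

print(idx)
print(identical(by_name, by_pos))
```

Selecting by name skips the lookup step entirely, which is why the index matrix is often unnecessary.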



Source: https://stackoverflow.com/questions/12368674/r-enumerate-column-combinations-of-a-matrix
