Question
(Edit note: I changed the title from "R grep: matching a matrix of strings to a list" to "R: enumerate column combinations of a matrix" to better reflect the solution.)
I am trying to match a matrix of strings to a list, so that I can ultimately use the matrix as a map in later operations on a data.frame.
This first part works as intended, returning a list of all the possible pairs, triples and quad combinations (though perhaps this approach has created my bind?):
priceList <- data.frame(aaa = rnorm(100, 100, 10), bbb = rnorm(100, 100, 10),
                        ccc = rnorm(100, 100, 10), ddd = rnorm(100, 100, 10),
                        eee = rnorm(100, 100, 10), fff = rnorm(100, 100, 10),
                        ggg = rnorm(100, 100, 10))
getTrades <- function(dd, Maxleg=3)
{
    nodes <- colnames(dd)
    tradeList <- list()
    for (i in 2:Maxleg){
        tradeLeg <- paste0('legs',i)
        tradeList[[tradeLeg]] <- combn(nodes, i)
    }
    return(tradeList)
}
tradeCombos <- getTrades(priceList, 4)
I'd now like to turn this list of possible combinations into trades. For example:
> tradeCombos[[1]][,1]
[1] "aaa" "bbb"
needs to eventually become priceList[,2] - priceList[,1], and so forth.
I have tried a few approaches with grep and similar commands, and feel that I've come close with the following:
LocList <- sapply(tradeCombos[[1]], regexpr, colnames(priceList))
However the format is not quite suitable for the next step.
Ideally, LocList[1] would return something like 1 2, assuming that tradeCombos[[1]][,1] is "aaa" "bbb".
Can someone please help?
__
With help from all of the answers below, I've now got:
colDiff <- function(x)
{
    Reduce('-', rev(x))
}
getTrades <- function(dd, Maxleg=3)
{
    tradeList <- list()
    for (i in 2:Maxleg){
        tradeLeg <- paste0('legs',i)
        tradeLegsList <- combn(names(dd), i,
                               function(x) dd[x], simplify = FALSE)
        nameMtx <- combn(names(dd), i)
        names(tradeLegsList) <- apply(nameMtx, MARGIN=2,
                                      FUN=function(x) paste(rev(x), collapse='*'))
        tradeList[[tradeLeg]] <- lapply(tradeLegsList, colDiff)
    }
    return(tradeList)
}
tradeCombos <- getTrades(priceList, 4)
This retains the names of the constituent parts, and is everything I was trying to achieve.
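As a quick sanity check, here is a sketch that runs the same two functions on a tiny deterministic data frame, so the names and values can be verified by hand. The data frame toy and its values are made up for illustration and are not part of the original data.

```r
# Hypothetical stand-in for priceList with easy-to-check values
toy <- data.frame(aaa = c(1, 2, 3), bbb = c(10, 20, 30), ccc = c(100, 200, 300))

# Same helper as above: subtract the columns in reverse order
colDiff <- function(x) Reduce('-', rev(x))

# Same function as above, repeated so this block runs on its own
getTrades <- function(dd, Maxleg = 3) {
  tradeList <- list()
  for (i in 2:Maxleg) {
    tradeLeg <- paste0('legs', i)
    tradeLegsList <- combn(names(dd), i, function(x) dd[x], simplify = FALSE)
    nameMtx <- combn(names(dd), i)
    names(tradeLegsList) <- apply(nameMtx, MARGIN = 2,
                                  FUN = function(x) paste(rev(x), collapse = '*'))
    tradeList[[tradeLeg]] <- lapply(tradeLegsList, colDiff)
  }
  tradeList
}

toyCombos <- getTrades(toy, 3)
names(toyCombos$legs2)            # "bbb*aaa" "ccc*aaa" "ccc*bbb"
toyCombos$legs2[["bbb*aaa"]]      # bbb - aaa: 9 18 27
toyCombos$legs3[["ccc*bbb*aaa"]]  # (ccc - bbb) - aaa: 89 178 267
```

Each list element is named after the legs in the order they are subtracted, so "bbb*aaa" holds bbb - aaa.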
Many thanks to all for the help.
Answer 1:
This gets your eventual aim using lapply, apply, and Reduce.
lapply(tradeCombos,
       function(combos)
           apply(combos, MARGIN=2, FUN=function(combo) Reduce('-', priceList[rev(combo)])))
combo is a column from one of the combo matrices in tradeCombos. rev(combo) reverses the column so the last value is first. The R syntax for selecting a subset of columns from a data.frame is DF[col.names], so priceList[rev(combo)] is a subset of priceList with just the columns in combo, in reverse order.
data.frames are actually just lists of columns, so any function that's designed to iterate over lists can be used to iterate over the columns in a data.frame. Reduce is one such function. Reduce takes a function (in this case the subtraction function, -) and a list of arguments, and then successively calls the function on the result of the previous call and the next argument in the list, e.g., (((arg1 - arg2) - arg3) - arg4).
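To make the folding order concrete, here is a small sketch of Reduce on plain numbers and then on the columns of a data frame. The data frame df and its values are made up for illustration.

```r
# Left fold over three numbers: (10 - 3) - 2
total <- Reduce('-', list(10, 3, 2))
total                                           # 5

# accumulate = TRUE keeps every intermediate result
steps <- Reduce('-', list(10, 3, 2), accumulate = TRUE)
steps                                           # 10 7 5

# On a data.frame the "arguments" are its columns, subtracted element-wise
df <- data.frame(a = c(1, 2), b = c(10, 20), c = c(100, 200))
spread <- Reduce('-', df[c("c", "b", "a")])     # (c - b) - a
spread                                          # 89 178
```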
You can rename the columns in tradeCombos so that the final column names reflect their source with:
tradeCombos <- lapply(tradeCombos,
    function(combos) {
        dimnames(combos)[[2]] <- apply(combos,
            MARGIN=2,
            FUN=function(combo) paste(rev(combo), collapse='-')
        )
        return(combos)
    }
)
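The effect of the renaming can be seen in a small sketch: once the combination matrix has column names, apply carries them over to the result. The data frame toy is a made-up stand-in for priceList, and colnames<- is used here as a shorthand for the dimnames assignment above.

```r
# Hypothetical stand-in for priceList
toy <- data.frame(aaa = c(1, 2), bbb = c(10, 20), ccc = c(100, 200))

# One combination per column: (aaa,bbb), (aaa,ccc), (bbb,ccc)
combos <- combn(names(toy), 2)

# Label each column with its legs in subtraction order
colnames(combos) <- apply(combos, 2,
                          function(combo) paste(rev(combo), collapse = '-'))

# apply() reuses the column names of combos for its result
spreads <- apply(combos, MARGIN = 2,
                 FUN = function(combo) Reduce('-', toy[rev(combo)]))
colnames(spreads)        # "bbb-aaa" "ccc-aaa" "ccc-bbb"
spreads[, "bbb-aaa"]     # 9 18
```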
Answer 2:
Whoa... ignore everything below and jump to the update
As mentioned in my comment, you can just use combn. This solution doesn't take you to your very last step, but instead creates a list of data.frames. From there, it is easy to use lapply to get to whatever your final step would be.
Here's the simplified function:
TradeCombos <- function(dd, MaxLeg) {
    combos <- combn(names(dd), MaxLeg)
    apply(combos, 2, function(x) dd[x])
}
To use it, just specify your dataset and the number of combinations you're looking for.
TradeCombos(priceList, 3)
TradeCombos(priceList, 4)
Moving on: @mplourde has shown you how to use Reduce to successively subtract. A similar approach would be taken here:
cumDiff <- function(x) Reduce("-", rev(x))
lapply(TradeCombos(priceList, 3), cumDiff)
By keeping the output of the TradeCombos function as a list of data.frames, you'll be leaving more room for flexibility. For instance, if you wanted row sums, you can simply use lapply(TradeCombos(priceList, 3), rowSums); similar approaches can be taken for whatever function you want to apply.
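A short sketch of that flexibility, using a made-up data frame toy in place of priceList: the same list of data.frames feeds both rowSums and cumDiff without any change to TradeCombos.

```r
# Hypothetical stand-in for priceList
toy <- data.frame(aaa = c(1, 2), bbb = c(10, 20), ccc = c(100, 200))

TradeCombos <- function(dd, MaxLeg) {
  combos <- combn(names(dd), MaxLeg)
  apply(combos, 2, function(x) dd[x])   # one data.frame per combination
}
cumDiff <- function(x) Reduce("-", rev(x))

pairs <- TradeCombos(toy, 2)            # (aaa,bbb), (aaa,ccc), (bbb,ccc)
sums  <- lapply(pairs, rowSums)         # first element holds 11 and 22
diffs <- lapply(pairs, cumDiff)         # first element holds 9 and 18
```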
Update
I'm not sure why @GSee didn't add this as an answer, but I think it's pretty awesome:
Get your list of data.frames as follows:
combn(names(priceList), 3, function(x) priceList[x], simplify = FALSE)
Advance as needed. (For example, using the cumDiff function we created: combn(names(priceList), 2, function(x) cumDiff(priceList[x]), simplify = FALSE).)
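Here is a minimal sketch of how the FUN and simplify arguments of combn interact, again on a made-up data frame toy. Naming the result with a second combn call is one possible approach, not something the answer itself prescribes.

```r
# Hypothetical stand-in for priceList
toy <- data.frame(aaa = c(1, 2), bbb = c(10, 20), ccc = c(100, 200))
cumDiff <- function(x) Reduce("-", rev(x))

# simplify = FALSE: a list with one element per combination
diffList <- combn(names(toy), 2, function(x) cumDiff(toy[x]), simplify = FALSE)

# Extra arguments to combn() are passed on to FUN, here paste(..., collapse = "_")
names(diffList) <- combn(names(toy), 2, paste, collapse = "_")
names(diffList)          # "aaa_bbb" "aaa_ccc" "bbb_ccc"
diffList$aaa_bbb         # bbb - aaa: 9 18

# simplify = TRUE (the default): the same values as a matrix, one column per combination
diffMat <- combn(names(toy), 2, function(x) cumDiff(toy[x]))
dim(diffMat)             # 2 3
```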
Answer 3:
tradeCombos is a list with matrix elements. Therefore, tradeCombos[[1]] is a matrix, for which apply is more suitable.
apply(tradeCombos[[1]],1,function(x) match(x,names(priceList)))
[,1] [,2]
[1,] 1 2
[2,] 1 3
[3,] 1 4
[4,] 1 5
[5,] 1 6
[6,] 1 7
[7,] 2 3
[8,] 2 4
[9,] 2 5
[10,] 2 6
[11,] 2 7
[12,] 3 4
[13,] 3 5
[14,] 3 6
[15,] 3 7
[16,] 4 5
[17,] 4 6
[18,] 4 7
[19,] 5 6
[20,] 5 7
[21,] 6 7
Incidentally, you can subset using the string form anyway, e.g. priceList[,"aaa"].
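To illustrate that the index lookup and the name-based subset give the same result, here is a small sketch on a made-up data frame toy; pair stands in for one column of tradeCombos[[1]].

```r
# Hypothetical stand-in for priceList
toy <- data.frame(aaa = c(1, 2), bbb = c(10, 20), ccc = c(100, 200))
pair <- c("aaa", "bbb")

# Route 1: translate names to column positions with match()
idx <- match(pair, names(toy))                 # 1 2
byIndex <- toy[, idx[2]] - toy[, idx[1]]

# Route 2: subset by name directly, no lookup needed
byName <- toy[, pair[2]] - toy[, pair[1]]

identical(byIndex, byName)                     # TRUE
```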
Source: https://stackoverflow.com/questions/12368674/r-enumerate-column-combinations-of-a-matrix