R, relating columns to row

Question

I have five columns, each named after a candidate:

can1 can2 can3 can4 can5

Each column holds binary data (TRUE or FALSE). I also have another column, CANDIDATES, a factor with five levels that holds the names of the same five candidates. So the data looks something like this:

can1 can2 can3 can4 can5 CANDIDATES

I want to create a new binary column that is TRUE when the candidate column named by the row's CANDIDATES value is TRUE in that row, and FALSE otherwise.

Example:

can1   can2   can3   can4   can5   CANDIDATES   new_colmn
TRUE   TRUE   FALSE  TRUE   FALSE  can2         TRUE
FALSE  TRUE   FALSE  FALSE  FALSE  can4         FALSE
FALSE  TRUE   TRUE   FALSE  FALSE  can2         TRUE
TRUE   TRUE   FALSE  FALSE  TRUE   can1         TRUE

Answer 1:

We can use matrix indexing to create the new column:

df$new_column <- df[-ncol(df)][cbind(1:nrow(df), match(df$CANDIDATES, names(df)))]

Explanation

The call match(df$CANDIDATES, names(df)) finds, for each value in the CANDIDATES column, the position of the column with that name. 1:nrow(df) is simply the sequence of row numbers. Binding the two together gives:

cbind(1:nrow(df), match(df$CANDIDATES, names(df)))
     [,1] [,2]
[1,]    1    2
[2,]    2    4
[3,]    3    2
[4,]    4    1

Each row of this matrix is a (row, column) pair. R can subset a data frame with a two-column matrix: the first column gives the row index and the second gives the column index.
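As a minimal sketch of the indexing rule on its own, the following uses a small stand-alone matrix. The names m, idx and picked are illustrative and not part of the original answer. Each row of idx selects one cell.

```r
# A 2 x 3 matrix, filled column by column: rows are (1, 3, 5) and (2, 4, 6)
m <- matrix(1:6, nrow = 2)

# Each row of idx is one (row, column) pair to extract
idx <- cbind(c(1, 2),   # row indices
             c(3, 1))   # column indices

picked <- m[idx]        # same cells as c(m[1, 3], m[2, 1])
print(picked)
```

The result has one element per row of idx, which is why this form of indexing suits a "one value per row" column.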

Matrix subsetting coerces the data frame to a matrix, which is fine as long as all columns have the same type. That is why we first restrict the data frame to the logical columns with df[-ncol(df)]: no type conversion occurs, and the result stays logical.
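To see why the logical columns are isolated first, here is a small self-contained sketch. The data frame is rebuilt from the example in the question, and the names idx, from_full and from_logical are hypothetical. Indexing the full data frame goes through a character matrix because of the CANDIDATES column, while indexing only the logical columns should keep the result logical.

```r
# Example data from the question; CANDIDATES is assumed to be a factor
# whose five levels are the candidate column names
df <- data.frame(
  can1 = c(TRUE, FALSE, FALSE, TRUE),
  can2 = c(TRUE, TRUE, TRUE, TRUE),
  can3 = c(FALSE, FALSE, TRUE, FALSE),
  can4 = c(TRUE, FALSE, FALSE, FALSE),
  can5 = c(FALSE, FALSE, FALSE, TRUE),
  CANDIDATES = factor(c("can2", "can4", "can2", "can1"),
                      levels = paste0("can", 1:5))
)

# One (row, column) pair per row of df
idx <- cbind(seq_len(nrow(df)), match(df$CANDIDATES, names(df)))

from_full    <- df[idx]                # mixed types: coerced to character
from_logical <- df[-ncol(df)][idx]     # logical columns only: stays logical

class(from_full)
class(from_logical)
```

Dropping the last column by position assumes CANDIDATES is the final column, as in the question; with a different layout the logical columns would need to be selected by name instead.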

Result:

df
   can1 can2  can3  can4  can5 CANDIDATES new_column
1  TRUE TRUE FALSE  TRUE FALSE       can2       TRUE
2 FALSE TRUE FALSE FALSE FALSE       can4      FALSE
3 FALSE TRUE  TRUE FALSE FALSE       can2       TRUE
4  TRUE TRUE FALSE FALSE  TRUE       can1       TRUE

Answer 2:

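You could also use a simple mapply for this:

df$new_colmn <-
  mapply(function(x, y) {
    df[x, y]                      # value in row x of the column named y
  },
  seq_len(nrow(df)),              # row number
  as.character(df$CANDIDATES))    # corresponding candidate column name

Essentially, for each row (the x argument) this returns the value in the column named by CANDIDATES (the y argument). The as.character() call is a precaution: if CANDIDATES is a factor, indexing with it directly would use its integer codes rather than its labels, which only picks the right column when the factor's level order matches the column order.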

Output:

> df
   can1 can2  can3  can4  can5 CANDIDATES new_colmn
1  TRUE TRUE FALSE  TRUE FALSE       can2      TRUE
2 FALSE TRUE FALSE FALSE FALSE       can4     FALSE
3 FALSE TRUE  TRUE FALSE FALSE       can2      TRUE
4  TRUE TRUE FALSE FALSE  TRUE       can1      TRUE
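
If a type-checked result is preferred, the same row-by-row lookup can be sketched with vapply, which is base R and stops with an error if any lookup does not return a single logical value. This is an alternative illustration rather than the answer's own method; the data frame is rebuilt from the question's example and cand is a hypothetical helper name.

```r
# Example data from the question; CANDIDATES is assumed to be a factor
df <- data.frame(
  can1 = c(TRUE, FALSE, FALSE, TRUE),
  can2 = c(TRUE, TRUE, TRUE, TRUE),
  can3 = c(FALSE, FALSE, TRUE, FALSE),
  can4 = c(TRUE, FALSE, FALSE, FALSE),
  can5 = c(FALSE, FALSE, FALSE, TRUE),
  CANDIDATES = factor(c("can2", "can4", "can2", "can1"),
                      levels = paste0("can", 1:5))
)

# Look columns up by name, not by factor code
cand <- as.character(df$CANDIDATES)

# For row i, take the value in the column named cand[i];
# logical(1) declares that each lookup must yield one logical value
df$new_colmn <- vapply(seq_len(nrow(df)),
                       function(i) df[i, cand[i]],
                       logical(1))

print(df)
```

Like mapply, this loops over rows in R, so for large data frames the matrix-indexing approach is likely to be faster.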

Source: https://stackoverflow.com/questions/33054259/r-relating-columns-to-row

Tags

data-analysis