How to print the name of the current row when using apply in R?

Question

For example, I have a matrix k

> k
  d e
a 1 3
b 2 4

I want to apply a function on k

> apply(k,MARGIN=1,function(p) {p+1})
a b
d 2 3
e 4 5

However, I also want to print the row name of the row being processed, so that I know which row the function is being applied to at that moment.

It might look like this:

apply(k,MARGIN=1,function(p) {print(rowname(p)); p+1})

But I really don't know how to do that in R. Does anyone have any idea?

Answer 1:

As far as I know you cannot do that with apply, but you could loop through the rownames of your data frame. Lame example:

lapply(rownames(mtcars), function(x) sprintf('The mpg of %s is %s.', x, mtcars[x, 1]))
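Applied to the matrix k from the question, the same idea might look like the sketch below. It iterates over the row names rather than the rows, so the name is available inside the function. The variable names res and rn and the message text are only illustrative, not part of the original question.

```r
# Rebuild the matrix from the question.
k <- matrix(c(1, 2, 3, 4), ncol = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

# Iterate over the row names instead of the rows, so the name is in scope.
res <- sapply(rownames(k), function(rn) {
  cat("Processing row:", rn, "\n")  # report which row is being handled
  k[rn, ] + 1                       # index the row by name, then do the work
})
```

Like apply with MARGIN=1, sapply puts each row's result in a column, so res should have the same layout as the apply output shown in the question.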

Answer 2:

Here's a neat solution to what I think you're asking. (I've called the input matrix mat rather than k for clarity - in this example, mat has 2 columns and 10 rows, and the rows are named abc1 through to abc10.)

In the code below, the result out1 is the thing you wanted to calculate (the outcome of the apply command). The result out2 comes out identically to out1 except that it prints out the rownames that it is working on (I put in a delay of 0.3 seconds per row so you can see it really does do this - take this out when you want the code to run full speed obviously!)

The trick I came up with was to cbind the row numbers (1 to n) onto the left of mat (to create a matrix with one additional column), and then use this to refer back to the rownames of mat. Note the line x = y[-1] which means that the actual calculation within the function (here, adding 1) ignores the first column of row numbers, which means it's the same as the calculation done for out1. Whatever sort of calculation you want to perform on the rows can be done this way - just pretend that y never existed, and formulate your desired calculation using x. Hope this helps.

set.seed(1234)
mat = as.matrix(data.frame(x = rpois(10,4), y = rpois(10,4)))
rownames(mat) = paste("abc", 1:nrow(mat), sep="")
out1 = apply(mat,1,function(x) {x+1})
out2 = apply(cbind(seq_len(nrow(mat)),mat),1,
             function(y) {
                           x = y[-1]
                           cat("Doing row:",rownames(mat)[y[1]],"\n")
                           Sys.sleep(0.3)
                           x+1
                          }
            )

identical(out1,out2)

Answer 3:

You can use a variable outside of the apply call to keep track of the row index and pass the row names as an extra argument to your function:

idx <- 1
apply(k, 1, function(p, rn) {print(rn[idx]); idx <<- idx + 1; p + 1}, rownames(k))

Answer 4:

This should work. The cat() function is what you want to use when printing results during evaluation of a function. paste(), conversely, just returns a character vector but doesn't send it to the command window.

The solution below uses a counter created as a closure, allowing it to "remember" how many times the function has been run before. Note the use of the superassignment operator <<-, which updates i in the enclosing environment instead of creating a new local variable. If you really want to understand what's going on here, I recommend reading through this wiki: https://github.com/hadley/devtools/wiki/

Note there may be an easier way to do this; my solution assumes that there is no way to access the rownumber or rowname of a current row using typical means within an apply function. As previously mentioned, this would be no problem in a loop.

k <- matrix(c(1,2,3,4),ncol=2)
rownames(k) <- c("a","b")
colnames(k) <- c("d","e")


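# Returns a function that counts how many times it has been called.
make.counter <- function(){
    i <- 0
    function(){
        i <<- i+1   # update i in the enclosing environment
        i
    }
}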

counter1 <- make.counter()

apply(k,MARGIN=1,function(p){
    current.row <- rownames(k)[counter1()]
    cat(current.row,"\n")
    return(p+1)
})
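For comparison, here is a rough sketch of the plain loop mentioned above, assuming the same matrix k. The row index i is available directly, so no counter is needed. The name out is only illustrative.

```r
k <- matrix(c(1, 2, 3, 4), ncol = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

out <- k  # start from a copy so the shape and dimnames are preserved
for (i in seq_len(nrow(k))) {
  cat(rownames(k)[i], "\n")  # the index gives direct access to the row name
  out[i, ] <- k[i, ] + 1
}
```

Note that out keeps the rows of k as rows, whereas apply with MARGIN=1 returns them as columns, so out is the transpose of the apply result.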

Answer 5:

Can't you simply make a new column of the row names and then reference it directly in the call to apply?
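A minimal sketch of that idea, assuming the matrix k from the question. The column name "name" and the variable names df and res are hypothetical. One caveat: a matrix can hold only one type, so the names go into a data frame here, and apply then converts every row to character, which is why the numeric values have to be converted back.

```r
k <- matrix(c(1, 2, 3, 4), ncol = 2,
            dimnames = list(c("a", "b"), c("d", "e")))

# Hypothetical helper column `name` holding the row names.
df <- data.frame(name = rownames(k), k, stringsAsFactors = FALSE)

res <- apply(df, 1, function(r) {
  cat("Processing row:", r[["name"]], "\n")
  as.numeric(r[c("d", "e")]) + 1  # apply() coerced the row to character
})
```

The round trip through character strings makes this less attractive for numeric work than indexing by row name or row number.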

Source: https://stackoverflow.com/questions/10956873/how-to-print-the-name-of-current-row-when-using-apply-in-r

Tags

statistics