for-loop statement understanding

Question

This is probably a stupid question, but I just cannot understand the logic behind it.

     Question 1:
     a = c(10,20,30)
     b = c(15,30,45)
     c = cbind(a,b)
     for ( i in 1:ncol(c))
          {d[i] = c[i]+2}

     print(d)
     # Error d not defined

Then I thought maybe I needed a placeholder for the result, so I did the following:

        Question 2: 
        d = matrix(NA, 3,2) # now have the same dimension as c         
        for ( i in 1:ncol(c)){d[i] = c[i]+2}

The output is

           [,1] [,2]
      [1,]   12   NA
      [2,]   22   NA
      [3,]   NA   NA

I cannot interpret the above output. I do not understand why I get NA in row 3.

I also tried the following:

      for ( i in 1:ncol(c)){d[,i] = c[,i]+2}

Then I get the right answer, but when I tried the following I got warnings:

      Question 3: 
      for ( i in 1:ncol(c)){d[i] = c[,i]+2}
      Warning messages:
      1: In d[i] = c[, i] + 2 :
         number of items to replace is not a multiple of replacement length
      2: In d[i] = c[, i] + 2 :
         number of items to replace is not a multiple of replacement length

I know I have asked a lot of questions, but if you can answer them for a dummy, I will appreciate it a lot.

Answer 1:

If there were no d in your workspace, the loop would have returned an error, since assignment via indexing to a not-yet-defined object is, well, ... not defined. If you had defined d as an empty numeric vector, you would have gotten:

      d <- numeric(0)
      for ( i in 1:ncol(c))
           {d[i] = c[i]+2}

      print(d)
      #[1] 12 22   # `c` at that point was a two-column matrix, but c[1] and c[2]
                  # are just single values because of the way matrix indexing works.

So we don't really know why you got that result in your case, but we do know that d must already have existed in your workspace, probably as a matrix with dimensions 3x2.

You seem to be confusing the indexing of data frames with that of matrices. If M is a matrix, then M[1] is the first element, the same as M[1,1]. In contrast, if D is a data frame, then D[1] is the entire first column (returned as a one-column data frame, possibly with very many rows), unlike D[1,1], which has length 1, as does M[1,1]. There is also a potential confusion (which you do not yet exhibit) where newcomers to R call length() on a data frame and, instead of the number of rows, get the number of columns.
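The difference can be checked directly. The following is a minimal sketch, assuming M and D are built from the vectors in the question (the names M and D come from the paragraph above; the construction is only an illustration):

```r
a <- c(10, 20, 30)
b <- c(15, 30, 45)
M <- cbind(a, b)        # 3x2 numeric matrix
D <- data.frame(a, b)   # data frame with the same contents

M[1]        # a single element, same value as M[1, 1]
M[, 1]      # the whole first column, as a vector of length 3
D[1]        # a one-column data frame holding all of column a
D[1, 1]     # a single element

length(M)   # 6: number of elements
length(D)   # 2: number of columns
nrow(D)     # 3: number of rows
```

For a data frame, nrow() and ncol() (or dim()) say what you mean more clearly than length().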

Answer 2:

I know it is bad practice to answer your own question, but after an extensive search I think I can give a very basic answer.

       Question 1: 
       As previously mentioned, this will give the following error: 
       #Error in d[i] = c[i] + 2 : object 'd' not found

Why? Because "d" does not exist. When i is equal to 1, c[i] is 10, and adding 2 gives 12. The result cannot be stored in "d", because "d" has not been defined. To solve the problem, you need to define "d", which is what I did in question 2. Defining means that before running the for-loop you create an empty placeholder where the results can be stored. That was not done here, hence R produces an error.
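A minimal sketch of the placeholder idea. It assumes the matrix is renamed m (a hypothetical name, chosen only so that the base function c() is not shadowed) and uses a plain numeric vector as the placeholder:

```r
a <- c(10, 20, 30)
b <- c(15, 30, 45)
m <- cbind(a, b)                # called c in the question

d <- numeric(ncol(m))           # placeholder: a vector of two zeros
for (i in seq_len(ncol(m))) {
  d[i] <- m[i] + 2              # m[i] is the i-th element, not the i-th column
}
print(d)
# [1] 12 22
```

The loop now runs without an error, but it still picks single elements rather than columns.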

       Question 2:
       d = matrix(NA, 3,2) # now have the same dimension as c         
       for ( i in 1:ncol(c)){d[i] = c[i]+2} 
       # elements of d, column by column: 12 22 NA NA NA NA

Why? Here 1:ncol(c) is 1 and 2, because there are only 2 columns, a and b. c[1] is equal to 10 and c[2] is equal to 20. This is element indexing, not column indexing. In other words, column "a" contains (10,20,30), but c[1] is the first element of column "a", which is 10, c[2] is the second element, which is 20, and so on. This is different from c[,1] (note the comma before the 1), which gives you the whole of column "a" as a vector, i.e. (10,20,30). The code works as follows:

        when i is equal to 1, c[1] is equal to 10
        the first element of matrix d becomes
        d[1] = 10 + 2
        d[1] = 12; continuing, now i = 2
        d[2] = 20 + 2
        d[2] = 22

Since there are only 2 columns, "i" can only be 1 and 2, so the remaining elements of d stay NA.
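R stores a matrix column by column, which is why a single index walks down column a first. A small sketch, again assuming the matrix is renamed m (a hypothetical name that avoids shadowing c()):

```r
m <- cbind(a = c(10, 20, 30), b = c(15, 30, 45))   # called c in the question

as.vector(m)   # storage order: 10 20 30 15 30 45 (column a, then column b)
m[4]           # the fourth element, i.e. row 1 of column b

d <- matrix(NA, 3, 2)
for (i in seq_len(ncol(m))) {
  d[i] <- m[i] + 2       # touches only elements 1 and 2 of d
}
which(!is.na(d))         # positions of d that were filled
```

Only positions 1 and 2 are filled, which correspond to rows 1 and 2 of the first column of d.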

       Question 3: 
       for ( i in 1:ncol(c)){d[i] = c[,i]+2}
       #Warning messages:
       # 1: In d[i] = c[, i] + 2 :
         #number of items to replace is not a multiple of replacement length
       #2: In d[i] = c[, i] + 2 :
         #number of items to replace is not a multiple of replacement length
       # elements of d, column by column (starting from a fresh NA matrix): 12 17 NA NA NA NA

Here the main issue is that c[,i] is a whole column (three values), but d[i] is a single element. When "i" = 1, d[i] is the first element of d, so only the first of the three values is stored, and R warns that the lengths do not match. Since there are only 2 columns, only two elements are filled and the rest of d stays NA. To see how it works, let us go through the code:

     when i is equal to 1, c[,1] = (10,20,30), so c[,1]+2 = (12,22,32)
     d[1] can hold only one value, so d[1] = 12
     The result 12 is placed in the first element of matrix d
     when i is equal to 2, c[,2] = (15,30,45), so c[,2]+2 = (17,32,47)
     d[2] can hold only one value, so d[2] = 17
     The result 17 is placed in the second element of matrix d

So the output will look as follows:

               [,1] [,2]
         [1,]   12   NA
         [2,]   17   NA
         [3,]   NA   NA

Also notice the placement of the number 17. It is the result for the first element of column 2, but it is placed in the second element of matrix d, i.e. row 2 of column 1. This is very dangerous.
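To avoid the mismatch, both sides of the assignment need the same shape. A sketch of two options, assuming the matrix is renamed m (hypothetical name) and that the goal is simply to add 2 to every element:

```r
m <- cbind(a = c(10, 20, 30), b = c(15, 30, 45))   # called c in the question

# Column-wise loop: both sides have three elements, so nothing is dropped.
d <- matrix(NA_real_, nrow(m), ncol(m))
for (i in seq_len(ncol(m))) {
  d[, i] <- m[, i] + 2
}

# Vectorised form: arithmetic on a matrix is applied element by element.
d2 <- m + 2

all(d == d2)   # TRUE
```

The vectorised form needs no placeholder and no loop, so for this task it is probably the simpler choice.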

Source: https://stackoverflow.com/questions/36411700/for-loop-statement-understanding

Tags

for-loop