Zero based arrays/vectors in R

Question

Is there some way to make R use zero-based indexing for vectors and other sequence data structures, as is done in C and Python, for example?

We have some code that does numerical processing in C, and we are thinking of porting it to R to make use of its advanced statistical functions, but the lack of zero-based indexing (as far as I can tell after googling) makes the task a bit more difficult.

Answer 1:

TL;DR: just don't do it!

I don't think the zero/one-based indexing is a major obstacle in porting your C code to R. However, if you truly believe that it is necessary to do so, you can certainly override the .Primitive('[') function, changing the behavior of the indexing/subsetting in R.

# rename the original `[`
> index1 <- .Primitive('[')

# WICKED!: override `[`. 
> `[` <- function(v, i) index1(v, i+1)
> x <- 1:5
> x[0]
[1] 1
> x[1]
[1] 2
> x[0:2]
[1] 1 2 3

However, this can be seriously dangerous: you have changed the fundamental indexing behavior, which can cause unexpected cascading effects in any code that relies on subsetting and indexing and picks up the overridden `[`.

For example, subsetting and indexing can accept other types of data as a selector (a boolean vector, say), and the simple overriding function doesn't take that into account, so you can get very strange behavior:

> x[x > 2] # x > 2 returns a boolean vector; adding 1 coerces
           # FALSE/TRUE to 0/1, so the selector becomes 1 1 2 2 2
[1] 1 1 2 2 2

Although this can be addressed by modifying the overriding function, you still may have other issues.
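The modification mentioned above could look roughly like the following sketch. It is written as a separate helper rather than as an override, and the name `index0` is hypothetical (it is not part of the original answer). It assumes that only non-negative numeric indices should be shifted and that logical or character selectors should pass through unchanged.

```r
index1 <- .Primitive('[')

# hypothetical helper: zero-based only for non-negative numeric indices
index0 <- function(v, i) {
  # logical and character selectors are not positions, so leave them alone
  if (is.logical(i) || is.character(i)) return(index1(v, i))
  # negative "drop" indices are deliberately not handled in this sketch
  stopifnot(is.numeric(i), all(i >= 0))
  index1(v, i + 1)
}

x <- 1:5
index0(x, 0)      # first element
index0(x, x > 2)  # logical selector behaves as in plain R
```

Even with this change, every caller still has to know which convention a given index follows, which is the underlying problem.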

Another example:

> (idx <- which(x > 2)) # which() still gives you 1-based index
[1] 3 4 5
> x[idx]
[1]  4  5 NA

You never know where things might go horribly wrong. So, just don't.
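As for the opening claim that the index base is not a major obstacle when porting, here is a small sketch of what a port without any override might look like. The C loop in the comment and the data in `a` are made up for illustration.

```r
# C original (for reference only):
#   double s = 0; for (int i = 0; i < n; i++) s += a[i] * i;
a <- c(2.0, 4.0, 6.0)
n <- length(a)

# literal port: keep the zero-based loop variable, shift only inside the brackets
s <- 0
for (i in seq_len(n) - 1) s <- s + a[i + 1] * i

# vectorized port: no explicit element access at all
s_vec <- sum(a * (seq_along(a) - 1))
```

Both versions compute the same sum; the vectorized form is usually the more natural one in R and sidesteps the index question entirely.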

Answer 2:

I want to build on Xin Yin's answer. It's possible to define a new class (such as zero-based_vector), define a `[` method for this class, and then add the class to the class attribute of the target vectors.

# define new method for our custom class 
index1 <- .Primitive('[')
`[.zero-based_vector` <- function(v, i) index1(as.vector(v), i+1)

x <- letters
# make `x` a zero-based_vector
class(x) <- c("zero-based_vector", class(x))
# it works!
x[0:3]
# [1] "a" "b" "c" "d"

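By the way, nothing else will be broken, because we don't override .Primitive('[') itself; only vectors that carry this class are indexed from zero. Functions that index such a vector internally will see the shifted indices too, though.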
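The method above only covers reading. A matching assignment method could be sketched along the same lines; the `[<-.zero-based_vector` method below is an assumption added for illustration and is not part of the original answer.

```r
index1 <- .Primitive('[')
`[.zero-based_vector` <- function(v, i) index1(as.vector(v), i + 1)

# hypothetical companion method for zero-based assignment
`[<-.zero-based_vector` <- function(v, i, value) {
  cl <- class(v)
  v <- unclass(v)      # drop the class so the base `[<-` is used below
  v[i + 1] <- value
  class(v) <- cl       # restore the class on the way out
  v
}

x <- letters[1:5]
class(x) <- c("zero-based_vector", class(x))
x[0] <- "z"
x[0]
```

As with the read method, negative and logical indices are not handled here.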

Answer 3:

You can make "auxiliary" indexes for more convenient work with 0-based constructions in R.

Here is the idea. Suppose we need to calculate y(x) = x^2 over integer x in [0; 10]:

x <- 0:10 # 0-based index used in the calculations
y <- numeric(length(x))
y[x+1] <- x^2 # have to add 1 when indexing y

x. <- x+1 # auxiliary 1-based index for indexing R vectors
y <- numeric(length(x))
y[x.] <- x^2 # no need to remember to add 1

Just pick a naming pattern for auxiliary indexes that suits you (it could be x1 or x_1) and train yourself to use it whenever you write square brackets. I decided to use the dot because, in my opinion, it is big enough to be noticeable but not so big that it makes the code messy.

The example above is simple, but if we need more complicated variable transformations such as $y_i = x_{i+1}*a^{i+1} + x_i*a^i$ (for i in [0; 10]), then taking care of the additional indexes pays for itself:

# assumes x holds x_0 .. x_11 (12 values) and a is a numeric scalar
i <- 0:10
y[i+1] <- x[i+2]*a^(i+1) + x[i+1]*a^i # does not resemble
                                      # the original formula

i. <- i+1
y[i.] <- x[i.+1]*a^(i+1) + x[i.]*a^i # now it is easier to see the
                                     # original formula behind this

As you can see, the code becomes clearer, which makes it easier to compare against the original formula when checking the code for mistakes.
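To try the second formula end to end, one might fill in concrete inputs and cross-check the result against a loop written directly from the formula. The values of `x` and `a` below are arbitrary assumptions chosen only to make the sketch runnable.

```r
a <- 2
x <- 1:12                 # x_0 .. x_11, stored at positions 1 .. 12
i  <- 0:10                # zero-based index from the formula
i. <- i + 1               # auxiliary one-based index for the brackets

y <- numeric(length(i))
y[i.] <- x[i. + 1] * a^(i + 1) + x[i.] * a^i

# cross-check: evaluate the formula one index at a time
y_loop <- sapply(i, function(k) x[k + 2] * a^(k + 1) + x[k + 1] * a^k)
stopifnot(isTRUE(all.equal(y, y_loop)))
```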

Answer 4:

Suppose you want the indices of your matrix G to run over (0..2, 0..2) instead of (1..3, 1..3), so that you can read entries like G(0,2) and set entries like G(0,2) = 5. Then you can use the following workaround:

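GG <- matrix(0, 3, 3)          # backing matrix, indexed from 1 as usual
G  <- function(i, j, x) {
  if (missing(x)) {
    return(GG[i + 1, j + 1])   # read: G(0, 2) returns GG[1, 3]
  }
  GG[i + 1, j + 1] <<- x       # write: G(0, 2, 5) sets GG[1, 3] to 5
  invisible(x)
}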

> G(0,2,5)  # write to G at position (0,2), i.e. GG at position (1,3),
            # setting the entry to 5
> GG
     [,1] [,2] [,3]
[1,]    0    0    5
[2,]    0    0    0
[3,]    0    0    0
> G(0,2)
[1] 5

Here G(0,2,5) plays the role of G(0,2) = 5, and G(0,2) returns the entry 5, so you can write calculations such as erg = G(0,2) + 3. To use indices that start at 0 instead of 1, you simply shift the indices by 1 and store the values in an auxiliary matrix GG.
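A variation on the same idea keeps the auxiliary matrix inside a closure instead of in the global environment, so several zero-based matrices can coexist. This is only a sketch; the constructor name `make_zero_based` is hypothetical.

```r
# hypothetical constructor: returns an accessor with its own private backing matrix
make_zero_based <- function(nrow, ncol) {
  m <- matrix(0, nrow, ncol)
  function(i, j, x) {
    if (missing(x)) return(m[i + 1, j + 1])   # read
    m[i + 1, j + 1] <<- x                     # write into the enclosed matrix
    invisible(x)
  }
}

G <- make_zero_based(3, 3)
H <- make_zero_based(2, 2)
G(0, 2, 5)
res <- G(0, 2) + 3
```

Each accessor modifies only the matrix it was created with, so writing through `G` leaves `H` untouched.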

Answer 5:

Another way would be the following (here an example with n = 3, where the backing matrix is allocated with n+1 rows and columns):

n  <- 3
GG <- matrix(0, n + 1, n + 1)          # backing matrix; GG[i+1, j+1] stores G(i, j)

G  <- function(i, j) GG[i + 1, j + 1]  # zero-based read access

# G(i, j) := x writes x into GG[i+1, j+1].
# The left-hand side is not evaluated; it is deparsed to a string such as
# "G(0, 1)" and taken apart by character position.
`:=` <- function(GAuf, x) {
  cl <- match.call()
  aa <- deparse(cl$GAuf)                           # e.g. "G(0, 1)"
  p_open  <- regexpr("(", aa, fixed = TRUE)[1]
  p_comma <- regexpr(",", aa, fixed = TRUE)[1]
  p_close <- regexpr(")", aa, fixed = TRUE)[1]
  fn <- substr(aa, 1, p_open - 1)                  # "G"
  i  <- substr(aa, p_open + 1, p_comma - 1)        # "0"
  j  <- substr(aa, p_comma + 2, p_close - 1)       # "1" (deparse adds a space after the comma)

  # build and run a command such as GG[0+1,1+1]<<-x
  com <- paste(fn, fn, "[", i, "+1,", j, "+1]", "<<-x", sep = "")
  eval(parse(text = com))
}

Now you can set a value by

G(0,1) := 2

and get the value by

> G(0,1)

[1] 2

Note the naming convention: for an accessor named 'N', the function ':=' expects a backing matrix named 'NN' (here 'G' and 'GG'), because the string 'com' at the end of ':=' is built with 'paste(fn,fn,...)'. If you want to name your accessor "hello", you have to define a matrix "hellohello" (like 'GG') and a function "hello" (like 'G').
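The same operator could also be sketched without string parsing, by working on the unevaluated call directly. The version below is an alternative under stated assumptions, not the answer's own method: it keeps the 'N'/'NN' naming convention, evaluates the indices in the caller's environment, and assigns the updated matrix back in that environment.

```r
GG <- matrix(0, 4, 4)
G  <- function(i, j) GG[i + 1, j + 1]

`:=` <- function(lhs, value) {
  cl <- substitute(lhs)                      # unevaluated call, e.g. G(0, 1)
  stopifnot(is.call(cl), length(cl) == 3)    # accessor name plus two indices
  env <- parent.frame()
  i <- eval(cl[[2]], env)                    # indices may be arbitrary expressions
  j <- eval(cl[[3]], env)
  fn <- as.character(cl[[1]])
  backing <- paste0(fn, fn)                  # naming convention: G -> GG
  m <- get(backing, envir = env)
  m[i + 1, j + 1] <- value
  assign(backing, m, envir = env)
  invisible(value)
}

k <- 1
G(k + 1, 3) := 7
G(2, 3)
```

Because the indices are evaluated rather than pasted into a command string, expressions such as `k + 1` work on the left-hand side.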

Source: https://stackoverflow.com/questions/25307549/zero-based-arrays-vectors-in-r

Tags

indexing