multiple conditions for if else statement in R

Posted by 主宰稳场 on 2020-04-10 06:46:45

Question


I would like to add several conditions to a loop that cycles through my data. The loop currently picks only the closest (not necessarily the most recent) previous owner within a set distance.

Previous owners (>20,000) are stored in a dataset called lifetime_census (data available here):

previous_id  reflo  locx    locy   lifespan  census_year   gr
5587         -310    -3     10     1810      2003          A
7687         -310    -3     10.1   110       2001          A
5101         Q1      17.3   0.8    55        2004          A
9109         Q1      17.4   0.9    953       2003          B
6077         M2      13     1.8    979       2003          B
8044         M2      13.1   1.7    100       2003          A
4076         M2      13.3   1.9    790       2002          B
6130         -49     -4     9      374       2004          A
7307         B.1     2.5    1      1130      2003          A

I then have an owners dataset (data available here):

squirrel_id  spr_census  reflo.x  spring_locx  spring_locy  spring_grid
6391         2005        M3       13           2.5          B
6130         2005        -310     -3           10           A
23586        2019        B9       2            9            B

To illustrate what I am trying to achieve:

squirrel_id spr_census reflo.x spring_locx spring_locy  spring_grid  previous_owner  census_year  gr
6391        2004       M3       13         2.5          B            6077            2003         B            
6130        2005       -310     -3         10           A            5587            2004         A
23586       2019       B9       2          9            B            NA              NA           NA

In this scenario, the loop finds the most recent and closest previous_id at the exact reflo (or the nearest previous_id within a set distance if there is no exact reflo match). This previous_id cannot be the same id as the current owner (squirrel_id), has to be from the same "city" (gr == spring_grid), and has to be from the current or a past year (census_year <= spr_census).

The conditions I'd like to add to the loop (in more technical terms):

  1. previous owner (lifetime_census$previous_id) cannot be current owner (owners$squirrel_id)
  2. address for previous owner needs to be from the same city (lifetime_census$gr) as current owner (owners$spring_grid)
  3. previous owner has to have lived at the address in the same year as the current owner or in an earlier year (lifetime_census$census_year <= owners$spr_census)
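
The three conditions above can be combined into a single logical vector, which avoids adding more nested if/else branches. A minimal sketch, assuming toy copies of the two tables built from a few of the sample rows; the names `owner` and `eligible` are hypothetical and only used for illustration:

```r
# Toy copies of a few sample rows (only the columns needed here).
lifetime_census <- data.frame(
  previous_id = c(5587, 7687, 6130),
  reflo       = c("-310", "-310", "-49"),
  census_year = c(2003, 2001, 2004),
  gr          = c("A", "A", "A"),
  stringsAsFactors = FALSE
)

# A single current owner (hypothetical one-row table).
owner <- data.frame(
  squirrel_id = 6130,
  spr_census  = 2005,
  spring_grid = "A",
  stringsAsFactors = FALSE
)

# One logical value per census row: TRUE only when all three conditions hold.
eligible <- lifetime_census$previous_id != owner$squirrel_id &  # 1. not the current owner
  lifetime_census$gr == owner$spring_grid &                     # 2. same grid ("city")
  lifetime_census$census_year <= owner$spr_census               # 3. not from the future

# Keep only the rows that pass every condition.
lifetime_census[eligible, ]
```

Here the third row is dropped because its previous_id equals the owner's squirrel_id; the other two rows remain as candidates.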

This gets me part-way there:

Calculates distances:

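# maximum search distance, in the same units as locx/locy
distance = 30

# Euclidean distance between (x1, y1) and (x2, y2); works on vectors too
distance_xy = function (x1, y1, x2, y2) {
  sqrt((x2 - x1)^2 + (y2 - y1)^2)
}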
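
Because R arithmetic is vectorised, distance_xy accepts a single point for (x1, y1) and whole columns for (x2, y2), returning one distance per row. A small check with made-up coordinates (not taken from the data):

```r
distance_xy <- function(x1, y1, x2, y2) {
  sqrt((x2 - x1)^2 + (y2 - y1)^2)
}

# One origin point against two made-up target points.
d <- distance_xy(0, 0, c(3, 6), c(4, 8))
d
# the two distances are 5 and 10
```

This is why the loop can pass lifetime_census$locx and lifetime_census$locy directly as the second point.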

Loop to find the previous neighbour at exact reflo, and then the next closest neighbour if there is no exact reflo match.

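# create the result column first, so it can be filled row by row
owners$previous_owner = NA

for (i in seq_len(nrow(owners))) {
  if (owners$reflo.x[i] %in% lifetime_census$reflo) {
    # exact reflo match: several rows may match, so keep only the first one
    owners$previous_owner[i] = lifetime_census[lifetime_census$reflo == owners$reflo.x[i], ]$previous_id[1L]
  } else {
    # no exact match: distance from this owner to every census location
    dt = distance_xy(owners$spring_locx[i], owners$spring_locy[i], lifetime_census$locx, lifetime_census$locy)
    if (any(dt <= distance)) {
      # at least one location is within range: take the nearest one
      owners$previous_owner[i] = lifetime_census[order(dt), ]$previous_id[1L]
    } else {
      # nothing within range
      owners$previous_owner[i] = NA
    }
  }
}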

Is there a way to pick the closest and most recent previous_id, after making sure that the previous_id is not the squirrel_id, is from the same "city" (i.e., lifetime_census$gr == owners$spring_grid), and is not from the "future" (i.e., lifetime_census$census_year <= owners$spr_census)?

I would also like to keep all the other columns associated with the previous_id, such as census_year and gr.

I find the nestedness of if-else statements particularly difficult to decipher, so please be liberal with annotations for your code suggestions.
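
One possible way to approach this, offered as a sketch rather than a definitive answer: filter the candidates with one logical vector, prefer exact reflo matches when any exist, then sort by year and distance and keep the first row. The helper name `find_previous_owner` and the variable `max_distance` are hypothetical. Two assumptions are made here: "most recent" takes priority over "closest" when the two disagree, and `max_distance` is set to 5 instead of the question's 30 purely so that the small sample produces an NA row.

```r
# Sample rows copied from the question.
lifetime_census <- data.frame(
  previous_id = c(5587, 7687, 5101, 9109, 6077, 8044, 4076, 6130, 7307),
  reflo       = c("-310", "-310", "Q1", "Q1", "M2", "M2", "M2", "-49", "B.1"),
  locx        = c(-3, -3, 17.3, 17.4, 13, 13.1, 13.3, -4, 2.5),
  locy        = c(10, 10.1, 0.8, 0.9, 1.8, 1.7, 1.9, 9, 1),
  census_year = c(2003, 2001, 2004, 2003, 2003, 2003, 2002, 2004, 2003),
  gr          = c("A", "A", "A", "B", "B", "A", "B", "A", "A"),
  stringsAsFactors = FALSE
)
owners <- data.frame(
  squirrel_id = c(6391, 6130, 23586),
  spr_census  = c(2005, 2005, 2019),
  reflo.x     = c("M3", "-310", "B9"),
  spring_locx = c(13, -3, 2),
  spring_locy = c(2.5, 10, 9),
  spring_grid = c("B", "A", "B"),
  stringsAsFactors = FALSE
)

# ASSUMPTION: 5 is an illustrative threshold; the question uses 30.
max_distance <- 5

# Hypothetical helper: returns a one-row data frame for a single owner.
find_previous_owner <- function(owner, census, max_distance) {
  # Distance from this owner to every census location.
  d <- sqrt((census$locx - owner$spring_locx)^2 +
            (census$locy - owner$spring_locy)^2)

  # All conditions at once, instead of nested if/else.
  ok <- census$previous_id != owner$squirrel_id &   # not the current owner
    census$gr == owner$spring_grid &                # same grid ("city")
    census$census_year <= owner$spr_census &        # current or past year
    d <= max_distance                               # within the set distance

  # No candidate left: return typed NAs so the rows bind cleanly later.
  if (!any(ok)) {
    return(data.frame(previous_owner = NA_real_, census_year = NA_real_,
                      gr = NA_character_, stringsAsFactors = FALSE))
  }

  cand <- census[ok, ]
  cand$d <- d[ok]

  # Prefer exact reflo matches when at least one exists.
  exact <- cand$reflo == owner$reflo.x
  if (any(exact)) cand <- cand[exact, ]

  # ASSUMPTION: most recent year first, nearest distance as tie-breaker.
  best <- cand[order(-cand$census_year, cand$d), ][1, ]

  data.frame(previous_owner = best$previous_id, census_year = best$census_year,
             gr = best$gr, stringsAsFactors = FALSE)
}

# Apply the helper to every owner and attach the extra columns.
matches <- do.call(rbind, lapply(seq_len(nrow(owners)), function(i) {
  find_previous_owner(owners[i, ], lifetime_census, max_distance)
}))
result <- cbind(owners, matches)
result
```

With these sample rows the previous owners come out as 6077, 5587 and NA. The census_year and gr columns of the chosen row are carried along, and swapping the two arguments of `order()` would rank by distance first instead.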

Source: https://stackoverflow.com/questions/60625882/multiple-conditions-for-if-else-statement-in-r
