Dividing values in a column of a data frame by values from a different data frame when row values match

Posted by 我们两清 on 2019-12-05 14:54:21

This is easy with data.table:

library(data.table)
#converting your data to the native type for the package (by reference)
setDT(x); setDT(area) 
x[area, density:=count/i.area, on="species"]

:= is the natural way to add columns in data.table. It works by reference (see this vignette, particularly point b), for more about this and why it's important), so x := y adds a column named x to your data.table and assigns it the value y, without copying the table.
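To make "by reference" concrete, here is a minimal sketch on a made-up table (dt, alias and snapshot are hypothetical names, not from the question). It assumes only data.table's documented := and copy():

```r
library(data.table)

dt <- data.table(count = c(10, 20))  # hypothetical toy table
alias <- dt                          # plain assignment: no copy is made
snapshot <- copy(dt)                 # explicit deep copy

# Adds the column in place; note there is no `dt <- ...` on the left
dt[, doubled := count * 2]

names(alias)     # also shows "doubled": alias and dt are the same object
names(snapshot)  # still only "count"
```

This is why the answer's one-liner modifies x directly; if you need the original untouched, take a copy() first.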

When merging in the form X[Y,], we can think of Y as selecting the rows of X to operate on. Further, when Y is a data.table, the columns of both X and Y are available in j (i.e., what comes after the comma), so we could have said density:=count/area. When we want to be sure that we're referring to one of Y's columns, we prefix its name with i. to make clear that it comes from i, i.e., what precedes the comma. There should be a vignette on merges forthcoming.
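As a worked illustration of the join described above, here is a small sketch on invented data. The column names species, count and area follow the answer; the values are assumptions chosen so the arithmetic is easy to check, and species "c" is deliberately missing from area:

```r
library(data.table)

# Hypothetical toy data standing in for the asker's tables
x <- data.table(species = c("a", "b", "a", "c"),
                count   = c(10, 20, 30, 40))
area <- data.table(species = c("a", "b"),
                   area    = c(2, 5))

# Update join: rows of x that match area on species get a new column,
# computed from x's count and area's area (referred to as i.area)
x[area, density := count / i.area, on = "species"]

x
```

Rows of x with no match in area (here species "c") are left with NA in density, and x keeps its original row order.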

In general, as soon as you think "match across different data sets" your instinct should be to merge. For more on data.table, see here.

I'd use a merge (left_join) then add new columns using mutate:

library(dplyr)

x %>% left_join(area, by="species") %>%
      mutate(density = count/area)
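The same pipeline can be tried on a small invented example. The data values below are assumptions for illustration only, and result is a hypothetical name:

```r
library(dplyr)

# Hypothetical toy data standing in for the asker's tables
x <- data.frame(species = c("a", "b", "a", "c"),
                count   = c(10, 20, 30, 40))
area <- data.frame(species = c("a", "b"),
                   area    = c(2, 5))

# left_join keeps every row of x; species without a match get NA for area
result <- x %>%
  left_join(area, by = "species") %>%
  mutate(density = count / area)

result
```

Unlike the data.table version, this returns a new data frame and leaves x unchanged, so assign the result if you want to keep it.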