How can I divide one column of a data frame by another?

余生分开走 2020-12-13 07:35

I want to divide one column by another to get the per-person time. How can I do this? I couldn't find anything on how to divide columns.

Here is some data that I want

2 Answers
  • 2020-12-13 07:50

    There are a plethora of ways in which this can be done. The problem is how to make R aware of the locations of the variables you wish to divide.

    Assuming

    d <- read.table(text = "263807.0    1582
    196190.5    1016
    586689.0    3479
    ")
    names(d) <- c("min", "count2.freq")
    > d
           min count2.freq
    1 263807.0        1582
    2 196190.5        1016
    3 586689.0        3479
    

    My preferred way

    To add the desired division as a third variable I would use transform()

    > d <- transform(d, new = min / count2.freq)
    > d
           min count2.freq      new
    1 263807.0        1582 166.7554
    2 196190.5        1016 193.1009
    3 586689.0        3479 168.6373
    

    The basic R way

    If you are doing this in a function (i.e. you are programming), it is best to avoid the sugar shown above and index instead. In that case any of these would do what you want:

    ## 1. via `[` and character indexes
    d[, "new"] <- d[, "min"] / d[, "count2.freq"]
    
    ## 2. via `[` with numeric indices
    d[, 3] <- d[, 1] / d[, 2]
    
    ## 3. via `$`
    d$new <- d$min / d$count2.freq
    

    All of these can be used at the prompt too, but which is easier to read:

    d <- transform(d, new = min / count2.freq)
    

    or

    d$new <- d$min / d$count2.freq ## or any of the above examples
    

    Hopefully you think like I do and the first version is better ;-)

    The reason we don't use the syntactic sugar of transform() et al when programming is because of how they do their evaluation (they look up the named variables). At the top level (at the prompt, working interactively) transform() et al work just fine. But buried in function calls or within a call to one of the apply() family of functions they can and often do break.
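
    As a minimal sketch of that advice, the division could be wrapped in a function that uses plain indexing. The helper name add_ratio and its arguments are hypothetical, chosen only for illustration; the column names are passed as strings, so nothing depends on how the call is evaluated.

    ```r
    # Hypothetical helper (not part of base R): divide one column by another
    # using `[[` with character names, and store the result in a new column.
    add_ratio <- function(df, num, den, out = "new") {
      df[[out]] <- df[[num]] / df[[den]]
      df
    }

    # Same values as the example data in this answer
    d <- data.frame(min = c(263807.0, 196190.5, 586689.0),
                    count2.freq = c(1582, 1016, 3479))

    d2 <- add_ratio(d, "min", "count2.freq")
    d2
    ```

    Because the column names arrive as ordinary character arguments, the same function can be called from inside other functions or from lapply() without relying on non-standard evaluation.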

    Likewise, be careful using numeric indices (## 2. above); if you change the ordering of your data, you will select the wrong variables.
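
    To illustrate the risk with numeric indices, here is a small sketch using the same example values. The variable names d_swapped, by_position and by_name are made up for this illustration.

    ```r
    d <- data.frame(min = c(263807.0, 196190.5, 586689.0),
                    count2.freq = c(1582, 1016, 3479))

    # Reorder the columns, as might happen after a merge or a re-import
    d_swapped <- d[, c("count2.freq", "min")]

    # Positional indexing now silently computes count2.freq / min
    by_position <- d_swapped[, 1] / d_swapped[, 2]

    # Indexing by name still computes min / count2.freq
    by_name <- d_swapped[, "min"] / d_swapped[, "count2.freq"]
    ```

    No error or warning is raised in the positional case; the result is simply the inverse ratio, which is why indexing by name is the safer of the two.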

    The preferred way if you don't need replacement

    If you just want to do the division (rather than insert the result back into the data frame), then use with(), which allows us to isolate the simple expression you wish to evaluate

    > with(d, min / count2.freq)
    [1] 166.7554 193.1009 168.6373
    

    This is again much cleaner code than the equivalent

    > d$min / d$count2.freq
    [1] 166.7554 193.1009 168.6373
    

    as it explicitly states "using d, execute the code min / count2.freq". Your preference may be different to mine, so I have shown all options.

  • 2020-12-13 08:01

    Hadley Wickham's dplyr package is always a lifesaver when it comes to data wrangling. To add the desired division as a third variable I would use mutate()

    library(dplyr)  # provides mutate()
    d <- mutate(d, new = min / count2.freq)
    