I have a data set that looks like this.
bankname bankid year totass cash bond loans
Bank A 1 1881 244789 7250 20218 29513
B
Here is a data.table solution.
library(data.table)
setDT(bankdata)
bankdata[, paste0(names(bankdata)[5:7], "toAsset") :=
lapply(.SD, function(x) x/totass), .SDcols=5:7]
bankdata
# bankname bankid year totass cash bond loans cashtoAsset bondtoAsset loanstoAsset
# 1: Bank A 1 1881 244789 7250 20218 29513 0.02961734 0.082593581 0.12056506
# 2: Bank B 2 1881 195755 10243 185151 2800 0.05232561 0.945830247 0.01430359
# 3: Bank C 3 1881 107736 13357 177612 0 0.12397899 1.648585431 0.00000000
# 4: Bank D 4 1881 170600 35000 20000 5000 0.20515826 0.117233294 0.02930832
# 5: Bank E 5 1881 32000000 351266 314012 0 0.01097706 0.009812875 0.00000000
Apply and cbind
cbind(bankdata,apply(bankdata[,5:7],2, function(x) x/bankdata$totass))
names(bankdata)[8:10] <- paste0(names(bankdata)[5:7], 'toAssest’)
> bankdata
bankname bankid year totass cash bond loans cashtoAssest bondtoAssest loanstoAssest
1 Bank A 1 1881 244789 7250 20218 29513 0.02961734 0.082593581 0.12056506
2 Bank B 2 1881 195755 10243 185151 2800 0.05232561 0.945830247 0.01430359
3 Bank C 3 1881 107736 13357 177612 NA 0.12397899 1.648585431 NA
4 Bank D 4 1881 170600 35000 20000 5000 0.20515826 0.117233294 0.02930832
5 Bank E 5 1881 32000000 351266 314012 NA 0.01097706 0.009812875 NA
You might be making this a little harder than necessary. Just try this and see if it yields what you need.
bankdata$CashtoAsset <- bankdata$cash / bankdata$totass
bankdata$BondtoAsset <- bankdata$bond / bankdata$totass
bankdata$loantoAsset <- bankdata$loans / bankdata$totass
bankdata
Yields this:
bankname bankid year totass cash bond loans CashtoAsset BondtoAsset loantoAsset
1 Bank A 1 1881 244789 7250 20218 29513 0.02961734 0.082593581 0.12056506
2 Bank B 2 1881 195755 10243 185151 2800 0.05232561 0.945830247 0.01430359
3 Bank C 3 1881 107736 13357 177612 0 0.12397899 1.648585431 0.00000
4 Bank D 4 1881 170600 35000 20000 5000 0.20515826 0.117233294 0.02930832
5 Bank E 5 1881 32000000 351266 314012 0 0.01097706 0.009812875 0.00000000
This should get you started in the right direction.
This is one of the big downsides of dplyr: as far as I'm aware, there is no straightforward way to use it programmatically rather than interactively without some kind of "hack" like the deplorable eval(parse(text=foo)) idiom.
The simplest approach is the same as in the Stata method, but string manipulation is a little more verbose in R than in Stata (or in any other scripting language, for that matter).
for (x in c("cash", "bond", "loans")) {
bankdata[sprintf("%stoAsset", x)] <- bankdata[x] / bankdata$totass # or, equivalently, bankdata["totass"] for a consistent "look"
## can also replace `sprintf("%stoAsset", x)` with `paste0(c(x, "toAsset"))` or even `paste(x, "toAsset", collapse="") depending on what makes more sense to you.
}
To make the whole thing more Stata-like, you can wrap the whole thing in within like so:
bankdata <- within(bankdata, for (x in c("cash", "bond", "loans")) {
assign(x, get(x) / totass)
})
but this entails some hacking with the get and assign functions which aren't as safe to use in general, although in your case it's probably not a big deal. I wouldn't recommend trying similar tricks with dplyr, for instance, because dplyr abuses R's nonstandard evaluation features and it's probably more trouble than it's worth. For a faster and probably superior solution, check out the data.table package which (I think) would allow you to use the Stata-like looping syntax but with dplyr-like speed. Check out the package vignette on CRAN.
Also, are you really, really sure you want to reassign NA entries to 0?
There has been a change. We have been using funs() in .funs (funs(name = f(.)). But this is changed (dplyr 0.8.0 above). Instead of funs, now we use list (list(name = ~f(.))). See the following new examples.
bankdata %>%
mutate_at(.funs = list(toAsset = ~./totass), .vars = vars(cash:loans))
bankdata %>%
mutate_at(.funs = list(toAsset = ~./totass), .vars = c("cash", "bond", "loans"))
bankdata %>%
mutate_at(.funs = list(toAsset = ~./totass), .vars = 5:7)
Since I answered this question, I have realized that some SO users have been checking this answer. The dplyr package has changed since then. Therefore, I leave the following update. I hope this will help some R users to learn how to use mutate_at().
mutate_each() is now deprecated. You want to use mutate_at(), instead. You can specify which columns you want to apply your function in .vars. One way is to use vars(). Another is to use a character vector containing column names, which you want to apply your custom function in .fun. The other is to specify columns with numbers (e.g., 5:7 in this case). Note that, if you use a column for group_by(), you need to change the numbers of column positions. Have a look of this question.
bankdata %>%
mutate_at(.funs = funs(toAsset = ./totass), .vars = vars(cash:loans))
bankdata %>%
mutate_at(.funs = funs(toAsset = ./totass), .vars = c("cash", "bond", "loans"))
bankdata %>%
mutate_at(.funs = funs(toAsset = ./totass), .vars = 5:7)
# bankname bankid year totass cash bond loans cash_toAsset bond_toAsset loans_toAsset
#1 Bank A 1 1881 244789 7250 20218 29513 0.02961734 0.082593581 0.12056506
#2 Bank B 2 1881 195755 10243 185151 2800 0.05232561 0.945830247 0.01430359
#3 Bank C 3 1881 107736 13357 177612 NA 0.12397899 1.648585431 NA
#4 Bank D 4 1881 170600 35000 20000 5000 0.20515826 0.117233294 0.02930832
#5 Bank E 5 1881 32000000 351266 314012 NA 0.01097706 0.009812875 NA
I purposely gave toAsset to the custom function in .fun since this will help me to arrange new column names. Previously, I used rename(). But I think it is much easier to clean up column names with gsub() in the present approach. If the above result is saved as out, you want to run the following code in order to remove _ in the column names.
names(out) <- gsub(names(out), pattern = "_", replacement = "")
I think you can save some typing in this way with dplyr. The downside is you overwrite cash, bond, and loans.
bankdata %>%
group_by(bankname) %>%
mutate_each(funs(whatever = ./totass), cash:loans)
# bankname bankid year totass cash bond loans
#1 Bank A 1 1881 244789 0.02961734 0.082593581 0.12056506
#2 Bank B 2 1881 195755 0.05232561 0.945830247 0.01430359
#3 Bank C 3 1881 107736 0.12397899 1.648585431 NA
#4 Bank D 4 1881 170600 0.20515826 0.117233294 0.02930832
#5 Bank E 5 1881 32000000 0.01097706 0.009812875 NA
If you prefer your expected outcome, I think some typing is necessary. The renaming part seems to be something you gotta do.
bankdata %>%
group_by(bankname) %>%
summarise_each(funs(whatever = ./totass), cash:loans) %>%
rename(cashtoAsset = cash, bondtoAsset = bond, loanstoAsset = loans) -> ana;
ana %>%
merge(bankdata,., by = "bankname")
# bankname bankid year totass cash bond loans cashtoAsset bondtoAsset loanstoAsset
#1 Bank A 1 1881 244789 7250 20218 29513 0.02961734 0.082593581 0.12056506
#2 Bank B 2 1881 195755 10243 185151 2800 0.05232561 0.945830247 0.01430359
#3 Bank C 3 1881 107736 13357 177612 NA 0.12397899 1.648585431 NA
#4 Bank D 4 1881 170600 35000 20000 5000 0.20515826 0.117233294 0.02930832
#5 Bank E 5 1881 32000000 351266 314012 NA 0.01097706 0.009812875 NA
Try:
for(i in 5:7){
bankdata[,(i+3)] = bankdata[,i]/bankdata[,4]
}
names(bankdata)[(5:7)+3] = paste0(names(bankdata)[5:7], 'toAssest')
Output:
bankdata
bankname bankid year totass cash bond loans cashtoAssest bondtoAssest loanstoAssest
1 Bank A 1 1881 244789 7250 20218 29513 0.02961734 0.082593581 0.12056506
2 Bank B 2 1881 195755 10243 185151 2800 0.05232561 0.945830247 0.01430359
3 Bank C 3 1881 107736 13357 177612 0 0.12397899 1.648585431 0.00000000
4 Bank D 4 1881 170600 35000 20000 5000 0.20515826 0.117233294 0.02930832
5 Bank E 5 1881 32000000 351266 314012 0 0.01097706 0.009812875 0.00000000