mutate

Conditionally mutate columns based on column class

随声附和 submitted on 2019-12-12 15:35:08
Question: My question is based on a previous topic posted here: Mutating multiple columns in a data frame. Suppose I have a tibble as follows:

id  char_var_1  char_var_2  num_var_1  num_var_2  ...  x_var_n
1   ...         ...         ...        ...        ...
2   ...         ...         ...        ...        ...
3   ...         ...         ...        ...        ...

where id is the key, char_var_x is a character variable, and num_var_x is a numerical variable. I have 346 columns in total and I want to write a function that scales all the numerical variables except the id column. I'm looking for an
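The excerpt is cut off before any answer, so the following is only a minimal sketch of one way to scale every numeric column except the key. It assumes dplyr 1.0 or later (for `across()` and `where()`); the toy tibble and the helper name `scale_numeric` are hypothetical and are not taken from the original thread.

```r
library(dplyr)

# Hypothetical toy data standing in for the 346-column tibble
df <- tibble(
  id         = 1:3,
  char_var_1 = c("a", "b", "c"),
  num_var_1  = c(10, 20, 30),
  num_var_2  = c(1, 5, 9)
)

# Scale every numeric column except the key column (assumed to be "id").
# scale() returns a one-column matrix, so it is flattened with as.numeric().
scale_numeric <- function(data, key = "id") {
  data %>%
    mutate(across(where(is.numeric) & !all_of(key), ~ as.numeric(scale(.x))))
}

scaled <- scale_numeric(df)
```

Selecting by column class with `where(is.numeric)` avoids listing column names by hand, and the key is excluded by name so that it is not centered or rescaled.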

Using sapply on column with missing values

≡放荡痞女 submitted on 2019-12-11 18:47:50
Question: I understand generally what the family of apply functions does, but I'm having trouble using it to mutate a new column based on another column with missing values. I can accomplish my task with a for loop, but I want to speed it up by using apply-type functions. Say I have a time series of indices that starts today and ends several years from now. My original indices only exist for the first few years. I then want to artificially extend these indices using

Elegant way to write function

旧城冷巷雨未停 submitted on 2019-12-11 18:27:41
Question: I have an input column (symbols) with more than 10000 rows that contains operator symbols and text values such as ("",">","<","","****","inv","MOD","seen"), shown as values in the code below. This column doesn't contain any numbers; it only contains the values stated in the code. I would like to map those operator symbols ('<', '>', etc.) to two different sets of codes, 1) Operator_codes and 2) Value_codes, and have these as separate columns. I already have a working

Mutate and case_when is giving NA's

风流意气都作罢 submitted on 2019-12-11 17:33:52
Question: I'm getting NAs returned when using dplyr's case_when inside mutate. I like case_when because I don't have to write long ifelse statements if I want the FALSE value to default to the original value. Is this not the point of using case_when? This code results in the NAs:

mtcars %>% as_tibble() %>% mutate(vs = case_when(carb == 4 ~ +5))

I'd like to add 1 to the vs column when the value of carb is 4. Thanks.

Answer 1: You need to define the TRUE argument to all the remaining conditions which
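The answer text is truncated, so here is a small sketch of the pattern it appears to describe: rows that match no condition in `case_when()` come back as NA unless a catch-all `TRUE ~ ...` branch is supplied. The sketch follows the stated goal (add 1 to vs where carb is 4) rather than the `+5` in the question's code; the name `result` is illustrative.

```r
library(dplyr)

result <- mtcars %>%
  as_tibble() %>%
  mutate(vs = case_when(
    carb == 4 ~ vs + 1,  # rows with carb == 4 get vs incremented
    TRUE      ~ vs       # every other row keeps its original value
  ))
```

Both branches return doubles, which matters because `case_when()` expects all right-hand sides to share a compatible type.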

Using lapply with mutate in R [duplicate]

半腔热情 submitted on 2019-12-11 09:57:17
Question: This question already has answers here: Mutate multiple columns in a dataframe (6 answers); Using functions of multiple columns in a dplyr mutate_at call (2 answers). Closed last year. I am having trouble putting some code into functions or running a loop in R. I want to replace variables (var1 through var4) in a dataframe based on the value in the 'var99' column. I am able to do this the following way:

var1 = c(1, 2, 1, 2)
var2 = c(3, 2, 1, 2)
var3 = c(0.4, 2, 1, 2)
var4 = c(1, 2, 1, 2)
n1 = c

Mutate new variable on one dataframe by deriving value from another dataframe - Incompatible Issue

流过昼夜 submitted on 2019-12-11 09:02:27
Question: I am using mutate to create a new column in dataframe A by fetching values from dataframe B. I tried the code below, but it throws an error, and I'm not sure what mistake I am making. Apologies that I can't share the data, as it is confidential. However, the objective is simple and I am sure I am making a blunder somewhere. Can you correct me? Here, dfm is the dataframe that was already created, from which I will use the 'Code' column

Overwrite lot of columns with mutate_at in R?

青春壹個敷衍的年華 submitted on 2019-12-11 06:17:23
Question: Given the following dataframe, I am trying to mutate all but the c and d columns using dplyr::mutate_at with a lambda function, but without luck:

structure(list(a = c(1, 2, 3), b = c(43, 2, -1), c = c(234242, -223, 1), d = c(1, 1, 2)), .Names = c("a", "b", "c", "d"), row.names = c(NA, -3L), class = c("tbl_df", "tbl", "data.frame"))

df %>% mutate_at(vars(-c("c", "d"), funs(x = x + rnorm(1, mean = mean(x), sd = sd(x))

I want to overwrite the existing a and b columns without the need to mutate each
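No answer survives in this excerpt. As a hedged sketch, one way to overwrite every column except c and d in place is `across()` with a purrr-style lambda, assuming dplyr 1.0 or later; this swaps in `across()` for the `mutate_at()`/`funs()` call the asker attempted. The data frame is rebuilt with `tibble()` from the values in the question, and the names `df` and `out` are illustrative.

```r
library(dplyr)

# Same values as the structure() call in the question
df <- tibble(
  a = c(1, 2, 3),
  b = c(43, 2, -1),
  c = c(234242, -223, 1),
  d = c(1, 1, 2)
)

set.seed(1)  # only so the random shift is reproducible

# Overwrite every column except c and d: each selected column gets one
# random offset drawn from a normal with that column's own mean and sd.
out <- df %>%
  mutate(across(
    !all_of(c("c", "d")),
    ~ .x + rnorm(1, mean = mean(.x), sd = sd(.x))
  ))
```

Because no `.names` argument is given, `across()` writes the results back under the original column names, so a and b are replaced rather than duplicated.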

Mutate value by using a value from a different row in a tibble

夙愿已清 submitted on 2019-12-11 00:32:52
Question: I want to calculate the distance from each node to the root, dtr. All I have is a vector, rel, that contains the parent node id for each node (in this example, id == 7 is the root):

library(tidyverse)
tmp <- tibble(
  id = 1:12,
  rel = c(2,7,4,2,4,5,7,7,10,8,7,7)
)

In the end I'm looking for this result:

tmp$dtr
[1] 2 1 3 2 3 4 0 1 3 2 1 1

So far I was able to write the following algorithm, until I got stuck trying to reference a different row in my code. The algorithm should work like this (pseudocode):
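The asker's pseudocode is not included in this excerpt, so the following is not their algorithm, only a minimal sketch of one straightforward approach: for each node, follow the parent pointers until the root is reached and count the steps. The helper name `dist_to_root` is hypothetical, and the sketch assumes the parent links contain no cycles other than the root pointing to itself.

```r
library(dplyr)

tmp <- tibble(
  id  = 1:12,
  rel = c(2, 7, 4, 2, 4, 5, 7, 7, 10, 8, 7, 7)
)

root <- 7

# Walk from `node` up the parent chain, counting hops until the root.
# match() looks up the row of the current node so its parent can be read
# from a different row than the one being computed.
dist_to_root <- function(node, id, parent, root) {
  steps <- 0
  while (node != root) {
    node  <- parent[match(node, id)]
    steps <- steps + 1
  }
  steps
}

tmp <- tmp %>%
  mutate(dtr = vapply(id, dist_to_root, numeric(1),
                      id = id, parent = rel, root = root))
```

Passing the full `id` and `rel` columns into the helper is what lets each row consult other rows, which is the step the question describes getting stuck on.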

Using dplyr and conditional formatting to construct an ordinary differential equation from strings

末鹿安然 submitted on 2019-12-11 00:23:04
Question: I am trying to create a system of equations for a specific variable using dplyr and prod from a dataframe of strings, to be used in an ordinary differential equation solver in R (deSolve). The location of the variable dictates the form of the equation, so I am using grep, filter_at, mutate_at, and apply. Constructing the equation depends on the column in which the string/variable of interest appears, based on the following: i. If a variable is ever found as a product (P1, P2, P3) then multiply: +1 *

How to find if ANY column has a specific value I am looking for?

假装没事ソ submitted on 2019-12-10 13:56:34
Question:

id  first  middle  last        Age
1   Carol  Jenny   Smith       15
2   Sarah  Carol   Roberts     20
3   Josh   David   Richardson  22

I am trying to find a specific name in ANY of the name columns (first, middle, last). For example, if I find anyone with the name Carol (it doesn't matter whether it's the first, middle, or last name), I want to mutate a column 'Carol' and set it to 1. So what I want is the following:

id  first  middle  last        Age  Carol
1   Carol  Jenny   Smith       15   1
2   Sarah  Carol   Roberts     20   1
3   Josh   David   Richardson  22   0

I have been trying
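The excerpt ends before the asker's attempt or any answer. As a hedged sketch, one way to flag rows where any of the three name columns equals "Carol" is to compare each column inside `across()` and count the matches per row with `rowSums()`; this assumes dplyr 1.0 or later. The tibble below is rebuilt from the table in the question.

```r
library(dplyr)

df <- tibble(
  id     = 1:3,
  first  = c("Carol", "Sarah", "Josh"),
  middle = c("Jenny", "Carol", "David"),
  last   = c("Smith", "Roberts", "Richardson"),
  Age    = c(15, 20, 22)
)

# across() returns one logical column per name column; rowSums() counts
# the matches in each row, and any count above zero becomes 1.
df <- df %>%
  mutate(Carol = as.integer(
    rowSums(across(c(first, middle, last), ~ .x == "Carol")) > 0
  ))
```

The same expression extends to more columns by widening the selection in `across()`, without writing one comparison per column.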