dplyr | 易学教程

How to combine two data frames using dplyr or other packages?

Question: I have two data frames:

    df1 = data.frame(index=c(0,3,4),n1=c(1,2,3))
    df1
    #   index n1
    # 1     0  1
    # 2     3  2
    # 3     4  3
    df2 = data.frame(index=c(1,2,3),n2=c(4,5,6))
    df2
    #   index n2
    # 1     1  4
    # 2     2  5
    # 3     3  6

I want to join these to:

      index n
    1     0 1
    2     1 4
    3     2 5
    4     3 8   (index 3 is in both data frames, so add the 2 and the 6)
    5     4 3
    6     5 0   (index 5 does not exist in either data frame, so set 0)
    7     6 0   (index 6 does not exist in either data frame, so set 0)

The given data frames are just part of a large dataset. Can I do it using dplyr or other packages
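One possible approach, offered as a sketch rather than a definitive answer: full-join the two data frames on `index`, add the two value columns while treating missing values as 0, then fill in the index values that appear in neither data frame. The sketch assumes that tidyr is available alongside dplyr and that the target index range is 0 to 6 (read off the desired output above); for the full dataset that range would have to be chosen to fit.

```r
library(dplyr)
library(tidyr)

df1 <- data.frame(index = c(0, 3, 4), n1 = c(1, 2, 3))
df2 <- data.frame(index = c(1, 2, 3), n2 = c(4, 5, 6))

result <- full_join(df1, df2, by = "index") %>%
  # A value missing on one side counts as 0 in the sum.
  mutate(n = coalesce(n1, 0) + coalesce(n2, 0)) %>%
  select(index, n) %>%
  # Assumed index range 0..6; rows absent from both inputs get n = 0.
  complete(index = as.numeric(0:6), fill = list(n = 0)) %>%
  arrange(index)

print(result)
```

If only the indices actually present in the data are needed, the `complete()` step can be dropped.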

Access the column names in the `mutate_at` to use it for subsetting a list

Question: I am trying to recode several variables, each with a different recoding scheme. The recoding schemes are saved in a list where each element is a named vector of the form old = new. Each element is the recoding scheme for one variable in the data frame. I am using the mutate_at function together with recode. I think the problem is that I cannot extract the variable name in order to get the correct recoding scheme from the list. I tried deparse(substitute(.)) as in here and also this, which didn't help. Also
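A minimal sketch of one way to reach the current column name: swap `mutate_at` for `across()` and call `cur_column()` inside it (both need dplyr 1.0.0 or later). The recoding itself is done here by looking values up in the named vector instead of calling `recode`, which is a substitution on my part. The data frame `df` and the list `schemes` are hypothetical stand-ins for the asker's data.

```r
library(dplyr)

# Hypothetical data and recoding schemes (one named vector per column, old = new).
df <- data.frame(v1 = c("a", "b", "a"),
                 v2 = c("x", "y", "y"),
                 stringsAsFactors = FALSE)
schemes <- list(
  v1 = c(a = "low", b = "high"),
  v2 = c(x = "no", y = "yes")
)

recoded <- df %>%
  mutate(across(
    all_of(names(schemes)),
    # cur_column() returns the name of the column being processed,
    # which selects the matching scheme from the list.
    ~ unname(schemes[[cur_column()]][as.character(.x)])
  ))

print(recoded)
```

Values that are missing from a scheme come back as `NA` with this lookup, so a scheme should cover every value that needs to survive.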

dplyr: Need help returning column index of first non-NA value in every row

Question: I've recently started trying to write all of my code in the tidyverse. This has sometimes led me into difficulties. Here is a simple task that I haven't been able to complete in the tidyverse: I need a column in a data frame that returns the position index of the first non-NA value from the left. Does anyone know how to achieve this in dplyr using mutate? Here is the desired output. data.frame( "X1"=c(100,rep(NA,8)), "X2"=c(NA,10,rep(NA,7)), "X3"=c(NA,NA,1000,1000,rep(NA,5)), "X4"=c(rep(NA,4),25
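As a hedged sketch of one approach: base R's `max.col()` with `ties.method = "first"` returns, for each row of a logical matrix, the position of the first `TRUE`, and it can be called from inside `mutate()`. The helper name `first_non_na` and the small data frame below are hypothetical; the asker's full data is cut off above and is not reproduced here.

```r
library(dplyr)

# Hypothetical example data; the last row is entirely NA.
df <- data.frame(X1 = c(100, NA, NA, NA),
                 X2 = c(NA, 10, NA, NA),
                 X3 = c(NA, 5, 1000, NA))

# Hypothetical helper: column position of the first non-NA value per row.
first_non_na <- function(data) {
  present <- !is.na(as.matrix(data))
  idx <- max.col(present, ties.method = "first")
  # max.col() reports 1 for an all-FALSE row, so mark all-NA rows explicitly.
  idx[rowSums(present) == 0] <- NA_integer_
  idx
}

# With the magrittr pipe, `.` inside the call refers to the piped data frame.
result <- df %>% mutate(first = first_non_na(.))

print(result)
```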

adding hash to each row using dplyr and digest in R

Question: I need to add a fingerprint to each row in a dataset so that I can compare it with a later version of the same set and look for differences. I know how to add a hash for each row in R like below: data.frame(iris,hash=apply(iris,1,digest)) I am learning to use dplyr because the dataset is getting huge and I need to store it in SQL Server. I tried something like below, but the hash is not working; all rows give the same hash: iris %>% rowwise() %>% mutate(hash=digest(.)) Any clue for row-wise hashing using dplyr?
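In the attempt above, `.` refers to the whole piped data frame even after `rowwise()`, so `digest(.)` hashes the same object for every row. A sketch of one workaround, under the assumption that hashing a pasted string of each row's values is an acceptable fingerprint: build one string per row, then hash each string. The separator and the temporary column name `row_key` are arbitrary choices of mine.

```r
library(dplyr)
library(digest)

hashed <- iris %>%
  # Paste all columns of each row into a single string (factors become labels).
  mutate(row_key = do.call(paste, c(., sep = "|"))) %>%
  # Hash each string on its own; serialize = FALSE hashes the raw text.
  mutate(hash = vapply(row_key, digest, character(1),
                       algo = "md5", serialize = FALSE,
                       USE.NAMES = FALSE)) %>%
  select(-row_key)

head(hashed)
```

Rows with identical values get identical hashes, which is what a comparison against a later version of the dataset needs. These hashes will not match the ones from `apply(iris, 1, digest)`, because the hashed input differs.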

Unable to subset (filter) a data frame due to NA's

Question: Why doesn't dplyr's filter return the same data.frame as base R subsetting in the code below? In fact, neither of them works as expected. I'd like to remove the observations/rows where, simultaneously, b==1 AND c==1. That is, I'd like to remove only the third row.

    require(dplyr)
    df <- data.frame(a=c(0,0,0,0,1,1,1), b=c(0,0,1,1,0,0,1), c=c(1,NA,1,NA,1,NA,NA))
    filter(df, !(b==1 & c==1))
    df[!(df$b==1 & df$c==1),]

Answer 1: Or use complete.cases to convert NA to FALSE in the resulting logical vector so that you
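The two results differ because the condition is `NA` wherever `b` is 1 and `c` is missing: `filter()` drops rows whose condition is `NA`, while base `[` returns a row of `NA`s for each of them. A minimal sketch of one fix, using dplyr's `coalesce()` to treat an `NA` condition as `FALSE` before negating it (this is an alternative to the `complete.cases` route mentioned above, not a reconstruction of it):

```r
library(dplyr)

df <- data.frame(a = c(0, 0, 0, 0, 1, 1, 1),
                 b = c(0, 0, 1, 1, 0, 0, 1),
                 c = c(1, NA, 1, NA, 1, NA, NA))

# b == 1 & c == 1 is NA when b is 1 and c is missing.
# coalesce() turns those NAs into FALSE, so only rows where the
# condition is definitely TRUE are removed.
kept <- filter(df, !coalesce(b == 1 & c == 1, FALSE))

print(kept)
```

Only the third row of `df` is removed; the rows where `c` is missing are kept.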

Divide or split dataframe into multiple dfs based on empty row and header title

Question: I have a dataframe which has multiple values in a single file. I want to divide it into multiple files, around 25 in total. The pattern is that wherever there is one blank row followed by a header title, a new df begins. I have tried "Splitting dataframes in R based on empty rows", but this does not take care of a blank row within the new df (V1 column, 9th row). I want the data to be divided on an empty row plus a header title. My data and the code I have tried are given below. Also how can
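Since the asker's data is cut off above, the following is only a sketch on made-up data. It starts a new group only where a header row directly follows a blank row (or sits on the first row), so a blank row inside a table does not cause a split. The header rule used here (values starting with "Title") is a hypothetical placeholder and would need to be replaced by whatever identifies real header rows.

```r
# Hypothetical input: two tables, with a stray blank row inside the first one.
raw <- data.frame(
  V1 = c("Title A", "1", "", "2", "", "Title B", "3"),
  stringsAsFactors = FALSE
)

is_blank  <- raw$V1 == ""
is_header <- grepl("^Title", raw$V1)        # hypothetical header rule
prev_blank <- c(TRUE, head(is_blank, -1))   # row 1 counts as following a blank

# A new table starts only at a header that follows a blank row.
starts <- is_header & prev_blank
group  <- cumsum(starts)

# Drop the blank rows, then split the rest by table.
pieces <- split(raw[!is_blank, , drop = FALSE], group[!is_blank])

print(pieces)
```

Each element of `pieces` could then be written out separately, for example with `write.csv()` inside a loop over `names(pieces)`.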