data-manipulation

Replacing NAs in a column with the values of other column

一世执手 submitted on 2019-12-03 06:48:21
I wonder how to replace NAs in a column with the values of another column in R using dplyr. An MWE is below.

    Letters <- LETTERS[1:5]
    Char <- c("a", "b", NA, "d", NA)
    df1 <- data.frame(Letters, Char)
    df1

    library(dplyr)
    df1 %>% mutate(Char1 = ifelse(Char != NA, Char, Letters))

      Letters Char Char1
    1       A    a    NA
    2       B    b    NA
    3       C <NA>    NA
    4       D    d    NA
    5       E <NA>    NA

You can use coalesce:

    library(dplyr)
    df1 <- data.frame(Letters, Char, stringsAsFactors = FALSE)
    df1 %>% mutate(Char1 = coalesce(Char, Letters))

      Letters Char Char1
    1       A    a     a
    2       B    b     b
    3       C <NA>     C
    4       D    d     d
    5       E <NA>     E

Source: https://stackoverflow.com/questions/46137115
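The all-NA column in the question comes from the comparison itself: `Char != NA` evaluates to `NA` for every row, so `ifelse()` never sees a TRUE or FALSE. As a minimal base-R sketch (an alternative to the `coalesce` answer, not taken from it), testing with `is.na()` gives the intended column. `stringsAsFactors = FALSE` is set on the assumption that the code may run on an R version older than 4.0.

```r
Letters <- LETTERS[1:5]
Char <- c("a", "b", NA, "d", NA)
df1 <- data.frame(Letters, Char, stringsAsFactors = FALSE)

# Any comparison with NA yields NA, so this condition is NA in every row
na_comparison <- df1$Char != NA

# is.na() is the reliable test for missing values
df1$Char1 <- ifelse(is.na(df1$Char), df1$Letters, df1$Char)
df1
```

The `coalesce()` version is shorter when dplyr is already loaded; the `is.na()` form avoids the extra dependency.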

How to run tapply() on multiple columns of data frame using R?

放肆的年华 submitted on 2019-12-03 05:47:10
Question: I have a data frame like the following:

    a b1 b2 b3 b4 b5 b6 b7 b8 b9
    D  4  6  9  5  3  9  7  9  8
    F  7  3  8  1  3  1  4  4  3
    R  2  5  5  1  4  2  3  1  6
    D  9  2  1  4  3  3  8  2  5
    D  5  4  3  1  6  4  1  8  3
    R  3  7  9  1  8  5  3  4  2
    D  4  1  8  2  6  3  2  7  5
    F  7  1  7  2  7  1  6  2  4
    D  6  3  9  3  9  9  7  1  2

The function tapply(df[,2], INDEX = df$a, sum) works fine to produce a table that sums everything in df[,2] by df$a, but when I try tapply(df[,2:10], INDEX = df$a, sum) to get a similar table, except with a sum for each column (2, 3, 4,..., 10), I

Arranging rows in custom order using dplyr

一个人想着一个人 submitted on 2019-12-03 03:39:49
Question: With the arrange function in dplyr, we can arrange rows in ascending or descending order. I wonder how to arrange rows in a custom order. Please see the MWE.

    Reg <- rep(LETTERS[1:3], each = 2)
    Res <- rep(c("Urban", "Rural"), times = 3)
    set.seed(12345)
    Pop <- rpois(n = 6, lambda = 500000)
    df <- data.frame(Reg, Res, Pop)
    df

      Reg   Res    Pop
    1   A Urban 500414
    2   A Rural 500501
    3   B Urban 499922
    4   B Rural 500016
    5   C Urban 501638
    6   C Rural 499274

    df %>% arrange()

Desired Output

      Reg   Res    Pop
    5   C Urban 501638
    6   C Rural

pandas reset_index after groupby.value_counts()

匆匆过客 submitted on 2019-12-02 23:54:31
I am trying to group by a column and compute value counts on another column.

    import pandas as pd
    dftest = pd.DataFrame({'A':[1,1,1,1,1,1,1,1,1,2,2,2,2,2],
                           'Amt':[20,20,20,30,30,30,30,40, 40,10, 10, 40,40,40]})
    print(dftest)

dftest looks like

        A  Amt
    0   1   20
    1   1   20
    2   1   20
    3   1   30
    4   1   30
    5   1   30
    6   1   30
    7   1   40
    8   1   40
    9   2   10
    10  2   10
    11  2   40
    12  2   40
    13  2   40

Perform the grouping:

    grouper = dftest.groupby('A')
    df_grouped = grouper['Amt'].value_counts()

which gives

    A  Amt
    1  30     4
       20     3
       40     2
    2  40     3
       10     2
    Name: Amt, dtype: int64

What I want is to keep the top two rows of each group. Also, I was perplexed by an error when I

How to run tapply() on multiple columns of data frame using R?

末鹿安然 submitted on 2019-12-02 20:26:27
I have a data frame like the following:

    a b1 b2 b3 b4 b5 b6 b7 b8 b9
    D  4  6  9  5  3  9  7  9  8
    F  7  3  8  1  3  1  4  4  3
    R  2  5  5  1  4  2  3  1  6
    D  9  2  1  4  3  3  8  2  5
    D  5  4  3  1  6  4  1  8  3
    R  3  7  9  1  8  5  3  4  2
    D  4  1  8  2  6  3  2  7  5
    F  7  1  7  2  7  1  6  2  4
    D  6  3  9  3  9  9  7  1  2

The function tapply(df[,2], INDEX = df$a, sum) works fine to produce a table that sums everything in df[,2] by df$a, but when I try tapply(df[,2:10], INDEX = df$a, sum) to get a similar table, except with a sum for each column (2, 3, 4,..., 10), I get an error message reading:

    Error in tapply(df[, 2:10], INDEX = df$a, sum) : arguments must have same
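The error arises because `tapply()` expects a single vector, not a multi-column data frame. One possible workaround, sketched here under the assumption that a plain matrix of per-group sums is acceptable, is to call `tapply()` once per column through `sapply()`. The data are copied from the question; the name `sums` is only illustrative.

```r
# Rebuild the data frame shown in the question
df <- read.table(header = TRUE, stringsAsFactors = FALSE, text = "
a b1 b2 b3 b4 b5 b6 b7 b8 b9
D 4 6 9 5 3 9 7 9 8
F 7 3 8 1 3 1 4 4 3
R 2 5 5 1 4 2 3 1 6
D 9 2 1 4 3 3 8 2 5
D 5 4 3 1 6 4 1 8 3
R 3 7 9 1 8 5 3 4 2
D 4 1 8 2 6 3 2 7 5
F 7 1 7 2 7 1 6 2 4
D 6 3 9 3 9 9 7 1 2
")

# Apply tapply() to each of columns 2 to 10 in turn;
# sapply() binds the per-group sums into one matrix (groups in rows)
sums <- sapply(df[, 2:10], function(col) tapply(col, df$a, sum))
sums
```

Base R's `aggregate(. ~ a, data = df, FUN = sum)` is another route if a data frame result is preferred.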

Arranging rows in custom order using dplyr

自古美人都是妖i submitted on 2019-12-02 18:34:02
With the arrange function in dplyr, we can arrange rows in ascending or descending order. I wonder how to arrange rows in a custom order. Please see the MWE.

    Reg <- rep(LETTERS[1:3], each = 2)
    Res <- rep(c("Urban", "Rural"), times = 3)
    set.seed(12345)
    Pop <- rpois(n = 6, lambda = 500000)
    df <- data.frame(Reg, Res, Pop)
    df

      Reg   Res    Pop
    1   A Urban 500414
    2   A Rural 500501
    3   B Urban 499922
    4   B Rural 500016
    5   C Urban 501638
    6   C Rural 499274

    df %>% arrange()

Desired Output

      Reg   Res    Pop
    5   C Urban 501638
    6   C Rural 499274
    1   A Urban 500414
    2   A Rural 500501
    3   B Urban 499922
    4   B Rural 500016

We can use factor to change
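The factor idea mentioned above can be sketched in base R as follows. This is an illustration under assumptions, not the original answer: the `Pop` values are typed in from the printed output rather than regenerated with `rpois()`, and `custom_order` is a hypothetical name for the desired sequence of `Reg`.

```r
# Data copied from the printed output in the question
df <- data.frame(
  Reg = rep(c("A", "B", "C"), each = 2),
  Res = rep(c("Urban", "Rural"), times = 3),
  Pop = c(500414, 500501, 499922, 500016, 501638, 499274),
  stringsAsFactors = FALSE
)

# Desired order of the Reg groups (hypothetical name)
custom_order <- c("C", "A", "B")

# A factor sorts by the position of its levels, not alphabetically;
# order() keeps tied rows in their original sequence
sorted <- df[order(factor(df$Reg, levels = custom_order)), ]
sorted

# With dplyr attached, the same sort key should work inside arrange():
# df %>% arrange(factor(Reg, levels = custom_order))
```

Base subsetting keeps the original row names, which matches the row labels in the desired output; `arrange()` would renumber them.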

How to acquire complete list of subdirs (including subdirs of subdirs)?

送分小仙女□ submitted on 2019-12-02 18:09:56
Question: I have thousands of city folders (for example city1, city2, and so on, but in reality named like NewYork, Boston, etc.). Each folder further contains two subfolders: land and house. So the directory structure is like:

    current directory
    ---- city1
    ----- house
    ------ many .xlsx files
    ----- land
    ----- city2
    ----- city3
    ···
    ----- city1000

I want to get the complete list of all subdirs and do some manipulation (like import excel). I know there is a macro extended function: local list: dir

Apply a recursive function to a nested list while preserving the classes of sublists

六月ゝ 毕业季﹏ submitted on 2019-12-02 08:14:18
Question: I have a nested list called inputs:

    library(htmltools)
    library(shiny)
    inputs = tagList(
      selectInput('first', 'FIRST', letters),
      checkboxInput('second', 'SECOND')
    )
    str(inputs, max.level = 1)

    List of 2
     $ :List of 3
      ..- attr(*, "class")= chr "shiny.tag"
      ..- attr(*, "html_dependencies")=List of 1
     $ :List of 3
      ..- attr(*, "class")= chr "shiny.tag"
     - attr(*, "class")= chr [1:2] "shiny.tag.list" "list"

I would like to modify all sublists that have class shiny.tag and whose name element equals label
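One way to recurse while keeping classes is to assign into `x[]`, which replaces the elements but leaves the attributes of `x` untouched. The sketch below is only an illustration: it uses hand-built stand-ins for the shiny objects (plain lists with the class set manually), and `modify_tags`, `label_tag`, `div_tag`, and the `class = "custom"` attribute are all hypothetical names and values, not part of shiny or htmltools.

```r
# Recursively apply f to every sublist of class "shiny.tag" whose
# "name" element is "label", keeping the class of each list intact
modify_tags <- function(x, f) {
  if (!is.list(x)) return(x)
  if (inherits(x, "shiny.tag") && identical(x[["name"]], "label")) {
    x <- f(x)
  }
  # x[] <- ... replaces the elements but preserves attributes such as class
  x[] <- lapply(x, modify_tags, f = f)
  x
}

# Hand-built stand-ins mimicking the structure shown by str()
label_tag <- structure(
  list(name = "label", attribs = list(), children = list("FIRST")),
  class = "shiny.tag"
)
div_tag <- structure(
  list(name = "div", attribs = list(), children = list(label_tag)),
  class = "shiny.tag"
)
inputs <- structure(list(div_tag), class = c("shiny.tag.list", "list"))

# Example modification: add a (hypothetical) class attribute to each label
out <- modify_tags(inputs, function(tag) {
  tag$attribs$class <- "custom"
  tag
})
```

A plain `lapply(x, ...)` without the `x[] <-` assignment would return bare lists and drop the `shiny.tag` classes, which is the problem the title describes.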

How to acquire complete list of subdirs (including subdirs of subdirs)?

蹲街弑〆低调 submitted on 2019-12-02 08:00:41
I have thousands of city folders (for example city1, city2, and so on, but in reality named like NewYork, Boston, etc.). Each folder further contains two subfolders: land and house. So the directory structure is like:

    current directory
    ---- city1
    ----- house
    ------ many .xlsx files
    ----- land
    ----- city2
    ----- city3
    ···
    ----- city1000

I want to get the complete list of all subdirs and do some manipulation (like import excel). I know there is a macro extended function: local list: dir to handle this issue, but it seems it can only return the first tier of subdirs, like city_i, rather

Replace each element equal to zero of a matrix with the corresponding element of the row above

时间秒杀一切 submitted on 2019-12-02 01:42:51
Question: I'm using R. I have a matrix and I want to replace each element of it equal to zero with the corresponding element of the row above. For example, I created the following matrix:

    AA <- matrix(c(1,2,3,1,4,5,1,0,2), ncol=3, nrow=3)

         [,1] [,2] [,3]
    [1,]    1    1    1
    [2,]    2    4    0
    [3,]    3    5    2

I want to replace 0 with the element AA[1,3]. I would like a function capable of doing this for each element of a matrix.

Answer 1: We could find the row/column index of elements that are 0 in the matrix ('i1'), then extract the
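The index-based idea in the answer can be sketched as below. The answer text is cut off, so this is a guess at the general approach rather than the author's code: `i1` follows the answer's naming, while `above` is a hypothetical helper, and zeros in the first row are skipped on the assumption that they have nothing above them to copy.

```r
AA <- matrix(c(1, 2, 3, 1, 4, 5, 1, 0, 2), ncol = 3, nrow = 3)

# Row/column positions of the zeros, as a two-column index matrix
i1 <- which(AA == 0, arr.ind = TRUE)

# Zeros in the first row have no row above, so leave them alone
i1 <- i1[i1[, "row"] > 1, , drop = FALSE]

# Same positions shifted up by one row
above <- i1
above[, "row"] <- above[, "row"] - 1

# Matrix indexing: copy each value from the row above into the zero cell
AA[i1] <- AA[above]
AA
```

If a column can contain several zeros in a row, a single pass copies a zero from above; repeating the replacement row by row from the top would handle that case.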