For each row return the column name of the largest value

礼貌的吻别 2020-11-21 07:06

I have a roster of employees, and I need to know which department each employee is in most often. It is trivial to tabulate employee ID against department name, but it is trickier

8 answers

谎友^ (original poster)

2020-11-21 07:45

A `dplyr` solution:

Idea:

add rowids as a column
reshape to long format
filter for max in each group

Code:

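library(dplyr)
library(tidyr)   # gather()
library(tibble)  # rownames_to_column()

DF = data.frame(V1=c(2,8,1),V2=c(7,3,5),V3=c(9,6,4))
DF %>% 
  rownames_to_column() %>%               # keep the row ID as a "rowname" column
  gather(column, value, -rowname) %>%    # long format: one row per (rowname, column)
  group_by(rowname) %>% 
  filter(rank(-value) == 1)              # keep the largest value in each row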

Result:

# A tibble: 3 x 3
# Groups:   rowname [3]
  rowname column value
  <chr>   <chr>  <dbl>
1 2       V1         8
2 3       V2         5
3 1       V3         9
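
One caveat worth sketching: with the default `ties.method = "average"`, `rank(-value) == 1` matches nothing when two columns share the row maximum, so that row silently disappears from the result. The snippet below is a minimal sketch of one way around this, using `ties.method = "first"`. The data frame `DF_ties` and the result name `top1` are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(tibble)

# Hypothetical data: row 1 has a tie for the maximum (V2 and V3 are both 9)
DF_ties <- data.frame(V1 = c(2, 8, 1), V2 = c(9, 3, 5), V3 = c(9, 6, 4))

top1 <- DF_ties %>%
  rownames_to_column() %>%
  gather(column, value, -rowname) %>%
  group_by(rowname) %>%
  # ties.method = "first" gives tied values distinct ranks in order of
  # appearance, so exactly one row per group ends up with rank 1
  filter(rank(-value, ties.method = "first") == 1) %>%
  ungroup() %>%
  arrange(rowname)

print(top1)
```

Here the tie in row 1 is resolved in favour of the first tied column (`V2`). Whether "first" is the right tie-break depends on the data, so treat it as an assumption rather than a recommendation.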

This approach can be easily extended to get the top n columns. Example for n=2:

DF %>% 
  rownames_to_column() %>%
  gather(column, value, -rowname) %>%
  group_by(rowname) %>% 
  mutate(rk = rank(-value)) %>%
  filter(rk <= 2) %>% 
  arrange(rowname, rk)

Result:

# A tibble: 6 x 4
# Groups:   rowname [3]
  rowname column value    rk
  <chr>   <chr>  <dbl> <dbl>
1 1       V3         9     1
2 1       V2         7     2
3 2       V1         8     1
4 2       V3         6     2
5 3       V2         5     1
6 3       V3         4     2
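
For comparison, the same top-n idea can probably be written with the newer tidyverse verbs. This is a sketch, not the answer's own code, and it assumes dplyr >= 1.0.0 (for `slice_max()`) and tidyr >= 1.0.0 (for `pivot_longer()`). The result name `top2` is made up for illustration.

```r
library(dplyr)
library(tidyr)
library(tibble)

DF <- data.frame(V1 = c(2, 8, 1), V2 = c(7, 3, 5), V3 = c(9, 6, 4))

top2 <- DF %>%
  rownames_to_column() %>%
  # long format: one row per (rowname, column) pair
  pivot_longer(-rowname, names_to = "column", values_to = "value") %>%
  group_by(rowname) %>%
  # keep the 2 largest values per row; with_ties = FALSE caps the
  # result at exactly 2 rows per group even when values are tied
  slice_max(value, n = 2, with_ties = FALSE) %>%
  ungroup() %>%
  arrange(rowname, desc(value))

print(top2)
```

Unlike the `rank()` version, this does not produce an `rk` column. If the rank itself is needed, a `mutate(rk = row_number())` on the grouped, sorted data would be one way to add it.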
