Reorganize data frame elements depending on the content of the rows in R

你说的曾经没有我的故事 posted on 2020-04-17 22:00:17

Question


I have this dataset:

df <- structure(list(V1 = c("B1D01", "B1D01", "B1D01", "B1D01", "B1D01", 
"B1D01", "U0155"), V2 = c("U0155", "U0155", "U0155", "U0155", 
"U0155", "U0155", "U3003"), V3 = c("U3003", "U3003", "C1B00", 
"U3003", "U3003", "U3003", "C1B00"), V4 = c("C1B00", "C1B00", 
"U0073", "C1B00", "C1B00", "C1B00", "P037D"), V5 = c("P037D", 
"P037D", NA, "P037D", "P037D", "P037D", "P0616"), V6 = c("P0616", 
"P0616", NA, "P0616", "P0616", "P0616", "P0562"), V7 = c("P0562", 
"P0562", NA, "P0562", "P0562", "P0562", "U0073"), V8 = c("U0073", 
"U0073", NA, "U0073", "U0073", "U0073", NA)), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8"), row.names = 1719:1725, class = "data.frame")

When I print(df):

        V1    V2    V3    V4    V5    V6    V7    V8
1719 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1720 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1721 B1D01 U0155 C1B00 U0073  <NA>  <NA>  <NA>  <NA>
1722 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1723 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1724 B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
1725 U0155 U3003 C1B00 P037D P0616 P0562 U0073  <NA>

As you can see, the codes are not aligned across columns. For instance, U3003 is mostly in V3, but it also appears in V2 (last row).
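A quick way to see this misalignment is to cross-tabulate each code against the column it sits in. The following is only a minimal sketch: it uses a small stand-in for df (rows 1719, 1721 and 1725, first four columns only), and the name positions is made up for illustration.

```r
# Stand-in for the question's df: rows 1719, 1721 and 1725, columns V1..V4 only
df <- data.frame(
  V1 = c("B1D01", "B1D01", "U0155"),
  V2 = c("U0155", "U0155", "U3003"),
  V3 = c("U3003", "C1B00", "C1B00"),
  V4 = c("C1B00", "U0073", "P037D"),
  row.names = c("1719", "1721", "1725"),
  stringsAsFactors = FALSE
)

m <- as.matrix(df)

# Count how often each code occurs in each column (table() skips NA by default)
positions <- table(code = as.vector(m), column = colnames(m)[col(m)])
print(positions)
```

A code whose row in this table has counts in more than one column (U3003 in both V2 and V3 here) is one that shifts position between rows.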

I would like to reorganize this data frame with these conditions:

  • Each code should be placed in its own column.
  • The column names should be the codes themselves.
  • If there are more than 8 distinct codes, the number of columns should match the number of codes.
  • The cell values should keep the code itself.
  • If a code is not present in a row, NA must appear.

Be aware that my original data frame contains many more rows than this small example extracted from it.


Answer 1:


The best way I found is to 'massage' the data frame: pivot it to a longer form, then pivot it back to a wide form with one column per code:

library(tidyverse)

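df %>% 
  rownames_to_column() %>%                            # keep the row names as a column
  pivot_longer(-rowname, values_drop_na = TRUE) %>%   # one row per (rowname, code), NAs dropped
  pivot_wider(id_cols = rowname, names_from = value, values_from = value)  # one column per code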

#> # A tibble: 7 x 9
#>   rowname B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#>   <chr>   <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
#> 1 1719    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 2 1720    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 3 1721    B1D01 U0155 <NA>  C1B00 <NA>  <NA>  <NA>  U0073
#> 4 1722    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 5 1723    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 6 1724    B1D01 U0155 U3003 C1B00 P037D P0616 P0562 U0073
#> 7 1725    <NA>  U0155 U3003 C1B00 P037D P0616 P0562 U0073

Created on 2020-04-03 by the reprex package (v0.3.0)
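
For comparison, the same reshaping can be sketched in base R without tidyverse. This is not part of the original answer, just a minimal illustration under stated assumptions: it uses a small stand-in for df (rows 1719, 1721 and 1725, first four columns only), and the names codes and wide are made up for the example.

```r
# Stand-in for the question's df: rows 1719, 1721 and 1725, columns V1..V4 only
df <- data.frame(
  V1 = c("B1D01", "B1D01", "U0155"),
  V2 = c("U0155", "U0155", "U3003"),
  V3 = c("U3003", "C1B00", "C1B00"),
  V4 = c("C1B00", "U0073", "P037D"),
  row.names = c("1719", "1721", "1725"),
  stringsAsFactors = FALSE
)

m <- as.matrix(df)

# Distinct codes in order of first appearance, reading the data row by row
all_codes <- as.vector(t(m))
codes <- unique(all_codes[!is.na(all_codes)])

# Start from an all-NA character matrix with one column per code
wide <- matrix(NA_character_, nrow = nrow(m), ncol = length(codes),
               dimnames = list(rownames(df), codes))

# Write each code into its own column for the rows that contain it
for (code in codes) {
  present <- rowSums(m == code, na.rm = TRUE) > 0
  wide[present, code] <- code
}

wide <- as.data.frame(wide, stringsAsFactors = FALSE)
print(wide)
```

Here the row names stay as row names instead of becoming a rowname column, and the number of columns follows the number of distinct codes, whatever the width of the input.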



Source: https://stackoverflow.com/questions/61009363/reorganize-data-frame-elements-depending-on-the-content-of-the-rows-in-r
