I converted a JSON file into a data.frame with a a nested list structure, which I would like to unnest and flatten. Some of the values in the list are NULL, which unnest does not accept. If I replace the NULL values with a data.frame structure that has only NA values, I get the desired result.
Below is a simplified example of my problem. I have tried to replace the NULL values with the NA data.frame but did not manage because of the the nested structure. How can I achieve the desired result?
Example
library(tidyr)
input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9))
input2 <- NULL
input3 <- data.frame(id = c(NA), value = c(NA))
df <- dplyr::tibble(
a = c(1, 2),
b = list(a = input1, c = input2))
unnest(df)
gives the error "Error: Each column must either be a list of vectors or a list of data frames [b]"
df2 <- dplyr::tibble(
a = c(1, 2),
b = list(a = input1, c = input3))
unnest(df2)
gives the desired output.
We can use map_lgl
from purrr
here. If you don't care about those rows with a NULL
, you could simply remove them with filter
and unnest
:
library(tidyverse)
df %>%
filter(!map_lgl(b, is.null)) %>%
unnest()
#> # A tibble: 3 x 3
#> a id value
#> <dbl> <fctr> <dbl>
#> 1 1 c 7
#> 2 1 d 8
#> 3 1 e 9
In case you want to keep those rows, you could bring them back with right_join
after unnesting:
df %>%
filter(!map_lgl(b, is.null)) %>%
unnest() %>%
right_join(select(df, a))
#> Joining, by = "a"
#> # A tibble: 4 x 3
#> a id value
#> <dbl> <fctr> <dbl>
#> 1 1 c 7
#> 2 1 d 8
#> 3 1 e 9
#> 4 2 <NA> NA
Data
input1 <- data.frame(id = c("c", "d", "e"), value = c(7, 8, 9))
input2 <- NULL
input3 <- data.frame(id = c(NA), value = c(NA))
df <- dplyr::tibble(
a = c(1, 2),
b = list(a = input1, c = input2)
)
来源:https://stackoverflow.com/questions/47224831/using-tidyr-unnest-with-null-values