Given a list of lists, my goal is to reverse its structure (R language).
So, I want to bring the elements of the nested lists to be el
How about this simple solution, which is completely general, and almost as fast as Josh O'Brien's original answer that assumed common internal names (#4).
zv <- unlist(unname(z), recursive=FALSE)
ans <- split(setNames(zv, rep(names(z), lengths(z))), names(zv))
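To make the two steps concrete, here is a small worked run. The example list `z` is taken from elsewhere in this thread; the comments describe what each intermediate value should contain.

```r
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(b = 4, a = 1, c = 0))

# Flatten one level. Dropping the outer names first keeps the inner names
# clean, so names(zv) is c("a", "b", "c", "b", "a", "c")
zv <- unlist(unname(z), recursive = FALSE)

# Relabel every value with its outer name (z1, z1, z1, z2, z2, z2),
# then group the values by their original inner name
ans <- split(setNames(zv, rep(names(z), lengths(z))), names(zv))

str(ans)
# ans$a is list(z1 = 1, z2 = 1); ans$b is list(z1 = 2, z2 = 4)
```

Because the grouping is done by name, the differing order of `a` and `b` in `z2` does not matter. Note that `split` returns the groups in sorted order of the inner names.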
And here is a general version that is robust to not having names:
invertList <- function(z) {
zv <- unlist(unname(z), recursive=FALSE)
zind <- if (is.null(names(zv))) sequence(lengths(z)) else names(zv)
if (!is.null(names(z)))
zv <- setNames(zv, rep(names(z), lengths(z)))
ans <- split(zv, zind)
if (is.null(names(zv)))
ans <- unname(ans)
ans
}
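A quick usage sketch for the unnamed case. The input `x` is made up for illustration; the function body is repeated only so the snippet runs on its own.

```r
invertList <- function(z) {
  zv <- unlist(unname(z), recursive = FALSE)
  # Without inner names, fall back to the position within each sub-list
  zind <- if (is.null(names(zv))) sequence(lengths(z)) else names(zv)
  if (!is.null(names(z)))
    zv <- setNames(zv, rep(names(z), lengths(z)))
  ans <- split(zv, zind)
  if (is.null(names(zv)))
    ans <- unname(ans)
  ans
}

# Illustrative input with no names at either level
x <- list(list(1, 2), list(3, 4))
res <- invertList(x)
str(res)
# res is list(list(1, 3), list(2, 4)): first elements together, then second
```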
The reshape package can get you close:
library(reshape)
b = recast(z, L2~L1)
split(b[,-1], b$L2)
The problem was that do.call(rbind, ...) was not calling rbind.data.frame, which does some matching of names. rbind.data.frame should work, because data.frames are lists and each sublist is a list, so we could just call it directly.
apply(do.call(rbind.data.frame, z), 1, as.list)
However, while this may be succinct, it is slow because do.call(rbind.data.frame, ...) is inherently slow.
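The difference between the two calls can be seen on a small example. This is only a sketch using the list `z` from this thread; the variable names `m` and `d` are mine.

```r
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(b = 4, a = 1, c = 0))

# Plain rbind stacks the sub-lists by position and takes the column
# names from the first one, so z2's "b" value ends up in column "a"
m <- do.call(rbind, z)
m[["z2", "a"]]   # 4

# rbind.data.frame matches the names of later arguments to the first
d <- do.call(rbind.data.frame, z)
d["z2", "a"]     # 1
```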
Something like this (in two steps):
# convert each component of z to a data.frame
# so that rbind dispatches to rbind.data.frame and named elements are matched
x <- data.frame(do.call(rbind, lapply(z, data.frame)))
# convert each column into an appropriately named list
o <- lapply(as.list(x), function(i,nam) as.list(`names<-`(i, nam)), nam = rownames(x))
o
$a
$a$z1
[1] 1
$a$z2
[1] 1
$b
$b$z1
[1] 2
$b$z2
[1] 4
$c
$c$z1
[1] 3
$c$z2
[1] 0
And an alternative
# unique names
nn <- Reduce(unique,lapply(z, names))
# convert from matrix to list `[` used to ensure correct ordering
as.list(data.frame(do.call(rbind,lapply(z, `[`, nn))))
I'd like to add a further solution to this valuable collection (to which I have turned many times):
revert_list_str_9 <- function(x) do.call(Map, c(c, x))
If this were code golf, we'd have a clear winner! Of course, this requires the individual list entries to be in the same order. This can be extended, using various ideas from above, such as
revert_list_str_10 <- function(x) {
nme <- names(x[[1]]) # from revert_list_str_4
do.call(Map, c(c, lapply(x, `[`, nme)))
}
revert_list_str_11 <- function(x) {
nme <- Reduce(unique, lapply(x, names)) # from revert_list_str_3
do.call(Map, c(c, lapply(x, `[`, nme)))
}
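To see what the one-liner does and why the reordering step matters: `c(c, x)` builds the argument list `list(c, z1 = ..., z2 = ...)`, so the call becomes `Map(c, z1 = x$z1, z2 = x$z2)`, which pairs elements purely by position. A small sketch (variable names are mine):

```r
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(b = 4, a = 1, c = 0))

# Position-based: z2's first element (b = 4) is paired with z1's "a"
by_position <- do.call(Map, c(c, z))
by_position$a   # c(z1 = 1, z2 = 4)

# Reorder every sub-list by the names of the first one, then pair up
nme <- names(z[[1]])
by_name <- do.call(Map, c(c, lapply(z, `[`, nme)))
by_name$a       # c(z1 = 1, z2 = 1)
```

Note that with `c` as the combining function the inner results are atomic vectors here rather than lists, because the values being combined are plain numbers.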
Performance-wise it's also not too shabby. If stuff is properly sorted, we have a new base R solution to beat. If not, timings still are very competitive.
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(b = 4, a = 1, c = 0))
microbenchmark::microbenchmark(
revert_list_str_1(z), revert_list_str_2(z), revert_list_str_3(z),
revert_list_str_4(z), revert_list_str_5(z), revert_list_str_7(z),
revert_list_str_9(z), revert_list_str_10(z), revert_list_str_11(z),
times = 1e3
)
#> Unit: microseconds
#> expr min lq mean median uq max
#> revert_list_str_1(z) 51.946 60.9845 67.72623 67.2540 69.8215 1293.660
#> revert_list_str_2(z) 461.287 482.8720 513.21260 490.5495 498.8110 1961.542
#> revert_list_str_3(z) 80.180 89.4905 99.37570 92.5800 95.3185 1424.012
#> revert_list_str_4(z) 19.383 24.2765 29.50865 26.9845 29.5385 1262.080
#> revert_list_str_5(z) 499.433 525.8305 583.67299 533.1135 543.4220 25025.568
#> revert_list_str_7(z) 56.647 66.1485 74.53956 70.8535 74.2445 1309.346
#> revert_list_str_9(z) 6.128 7.9100 11.50801 10.2960 11.5240 1591.422
#> revert_list_str_10(z) 8.455 10.9500 16.06621 13.2945 14.8430 1745.134
#> revert_list_str_11(z) 14.953 19.8655 26.79825 22.1805 24.2885 2084.615
Unfortunately, this is not my creation, but exists courtesy of @thelatemail.
Edit:
Here's a more flexible version that will work on lists whose elements don't necessarily contain the same set of sub-elements.
fun <- function(ll) {
nms <- unique(unlist(lapply(ll, function(X) names(X))))
ll <- lapply(ll, function(X) setNames(X[nms], nms))
ll <- apply(do.call(rbind, ll), 2, as.list)
lapply(ll, function(X) X[!sapply(X, is.null)])
}
## An example of an 'unbalanced' list
z <- list(z1 = list(a = 1, b = 2),
z2 = list(b = 4, a = 1, c = 0))
## Try it out
fun(z)
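For this unbalanced list, the inverted result should contain `c` only under `z2`. As a cross-check that does not rely on `fun` itself, here is a sketch using the split-based idea posted elsewhere in this thread, which also groups by name and so simply has nothing to put under `z1` for `c`:

```r
z <- list(z1 = list(a = 1, b = 2),
          z2 = list(b = 4, a = 1, c = 0))

# Flatten, label each value with its outer name, group by inner name
zv <- unlist(unname(z), recursive = FALSE)
res <- split(setNames(zv, rep(names(z), lengths(z))), names(zv))

str(res)
# res$a is list(z1 = 1, z2 = 1); res$c is list(z2 = 0)
```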
Original answer
z <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(b = 4, a = 1, c = 0))
zz <- lapply(z, `[`, names(z[[1]])) ## Get sub-elements in same order
apply(do.call(rbind, zz), 2, as.list) ## Stack and reslice
The recently released purrr package contains a function, transpose, whose purpose is to 'revert' a list structure. There is a major caveat to the transpose function: it matches on position and not name (https://cran.r-project.org/web/packages/purrr/purrr.pdf). This means that it is not the correct tool for benchmark 1 above. I therefore only consider benchmarks 2 and 3 below.
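Since a position-based transpose gives wrong pairings without any warning when sub-lists are ordered differently, it can help to check the ordering first. A minimal base R sketch; `same_name_order` is a hypothetical helper of mine, and `B1` is assumed to look like the misordered list `z` used elsewhere in this thread.

```r
# Hypothetical helper: TRUE if every sub-list has the same names in the
# same order as the first sub-list
same_name_order <- function(x) {
  first <- names(x[[1]])
  all(vapply(x, function(el) identical(names(el), first), logical(1)))
}

# Assumed shape of benchmark 1: same names, different order in z2
B1 <- list(z1 = list(a = 1, b = 2, c = 3), z2 = list(b = 4, a = 1, c = 0))
B2 <- list(z1 = list(a = 1, b = 2, c = "ciao"), z2 = list(a = 0, b = 3, c = 5))

same_name_order(B1)  # FALSE: position-based matching would mix a and b
same_name_order(B2)  # TRUE: position and name matching agree
```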
B2 <- list(z1 = list(a = 1, b = 2, c = 'ciao'), z2 = list(a = 0, b = 3, c = 5))
revert_list_str_8 <- function(ll) { # @z109620
purrr::transpose(ll) # namespace-qualified so it runs without library(purrr)
}
microbenchmark(revert_list_str_1(B2), revert_list_str_3(B2), revert_list_str_4(B2), revert_list_str_7(B2), revert_list_str_8(B2), times = 1e3)
Unit: microseconds
expr min lq mean median uq max neval
revert_list_str_1(B2) 228.752 254.1695 297.066646 268.8325 293.5165 4501.231 1000
revert_list_str_3(B2) 211.645 232.9070 277.149579 250.9925 278.6090 2512.361 1000
revert_list_str_4(B2) 79.673 92.3810 112.889130 100.2020 111.4430 2522.625 1000
revert_list_str_7(B2) 237.062 252.7030 293.978956 264.9230 289.1175 4838.982 1000
revert_list_str_8(B2) 2.445 6.8440 9.503552 9.2880 12.2200 148.591 1000
Clearly the transpose function is the winner! It also requires much less code.
B3 <- list(z1 = list(a = 1, b = m, c = 'ciao'), z2 = list(a = 0, b = 3, c = m))
microbenchmark(revert_list_str_1(B3), revert_list_str_3(B3), revert_list_str_4(B3), revert_list_str_7(B3), revert_list_str_8(B3), times = 1e3)
Unit: microseconds
expr min lq mean median uq max neval
revert_list_str_1(B3) 229.242 253.4360 280.081313 266.877 281.052 2818.341 1000
revert_list_str_3(B3) 213.600 232.9070 271.793957 248.304 272.743 2739.646 1000
revert_list_str_4(B3) 80.161 91.8925 109.713969 98.980 108.022 2403.362 1000
revert_list_str_7(B3) 236.084 254.6580 287.274293 264.922 280.319 2718.628 1000
revert_list_str_8(B3) 2.933 7.3320 9.140367 9.287 11.243 55.233 1000
Again, transpose outperforms all others.
The problem with the benchmarks above is that they focus on very small lists. For this reason, the numerous loops nested within functions 1-7 do not pose too much of a problem. As the size of the list, and therefore the number of iterations, increases, the speed gains of transpose will likely increase.
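To test that expectation, one would need a larger input. A rough base R sketch of how such a list could be generated and timed; `make_big_list`, `invert_map` and `invert_split` are hypothetical names of mine wrapping ideas from this thread, and the sizes are arbitrary. No timings are claimed here.

```r
# Hypothetical helper: n outer entries, each holding k named sub-elements
make_big_list <- function(n, k) {
  inner_names <- paste0("v", seq_len(k))
  outer <- lapply(seq_len(n), function(i) {
    setNames(as.list(seq_len(k) + i), inner_names)
  })
  setNames(outer, paste0("z", seq_len(n)))
}

# Two base R inverters from this thread, wrapped for comparison
invert_map <- function(x) do.call(Map, c(c, x))
invert_split <- function(z) {
  zv <- unlist(unname(z), recursive = FALSE)
  split(setNames(zv, rep(names(z), lengths(z))), names(zv))
}

big <- make_big_list(1000, 50)
system.time(r1 <- invert_map(big))
system.time(r2 <- invert_split(big))
```

Swapping in the other functions from the benchmarks above (and raising `n` and `k`) should show how each one scales.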
The purrr package is awesome! It does a lot more than revert lists. In combination with the dplyr package, the purrr package makes it possible to do most of your coding using the powerful and beautiful functional programming paradigm. Thank the lord for Hadley!