nested sapply in R - breakdown

Submitted by 丶灬走出姿态 on 2020-04-18 05:43:17

Question


This post is related to my previous question about extracting data from nested lists, which has been answered. One of the answers contains a nested sapply call:

usageExist <- sapply(garden$fruit, function(f){
  sapply(garden$usage, '%in%', x = names(productFruit$type[[f]][["usage"]]))}) 

I am very new to data.table and apply functions and struggle to understand:

What is happening in this particular line of code?

Why does cooking appear twice in the results after running usageExist?

What is the purpose of the argument f in the function within sapply?

The structure and results of the data are provided below:

> str(productFruit)
List of 2
 $ Basket: chr "DUH"
 $ type  :List of 3
  ..$ Fruit 1124:List of 3
  .. ..$ ID   : num 1
  .. ..$ color: chr "poor"
  .. ..$ usage:List of 2
  .. .. ..$ eating  :List of 3
  .. .. .. ..$ ID      : num 1
  .. .. .. ..$ quality : chr "good"
  .. .. .. ..$ calories: num 500
  .. .. ..$ medicine:List of 3
  .. .. .. ..$ ID      : num 2
  .. .. .. ..$ quality : chr "poor"
  .. .. .. ..$ calories: num 300
  ..$ Fruit 1068:List of 3
  .. ..$ ID   : num [1:3] 1 2 3
  .. ..$ color: num [1:3] 3 4 5
  .. ..$ usage:List of 4
  .. .. ..$ eating  :List of 3
  .. .. .. ..$ ID      : num 1
  .. .. .. ..$ quality : chr "poor"
  .. .. .. ..$ calories: num 420
  .. .. ..$ cooking :List of 3
  .. .. .. ..$ ID      : num 2
  .. .. .. ..$ quality : chr "questionable"
  .. .. .. ..$ calories: num 600
  .. .. ..$ drinking:List of 3
  .. .. .. ..$ ID      : num 3
  .. .. .. ..$ quality : chr "good"
  .. .. .. ..$ calories: num 800
  .. .. ..$ medicine:List of 3
  .. .. .. ..$ ID      : num 4
  .. .. .. ..$ quality : chr "good"
  .. .. .. ..$ calories: num 0
  ..$ Fruit 1051:List of 3
  .. ..$ ID   : num [1:3] 1 2 3
  .. ..$ color: num [1:3] 3 4 5
  .. ..$ usage:List of 3
  .. .. ..$ cooking :List of 3
  .. .. .. ..$ ID      : num 1
  .. .. .. ..$ quality : chr "good"
  .. .. .. ..$ calories: num 49
  .. .. ..$ drinking:List of 3
  .. .. .. ..$ ID      : num 2
  .. .. .. ..$ quality : chr "questionable"
  .. .. .. ..$ calories: num 11
  .. .. ..$ medicine:List of 3
  .. .. .. ..$ ID      : num 3
  .. .. .. ..$ quality : chr "poor"
  .. .. .. ..$ calories: num 55


> str(garden)
Classes ‘data.table’ and 'data.frame':  5 obs. of  3 variables:
 $ fruit   : chr  "Fruit 1124" "Fruit 100" "Fruit 1051" "Fruit 1068" ...
 $ usage   : chr  "cooking" "cooking" "NA" "drinking" ...
 $ reported: chr  "200" "500" "77" "520" ...
 - attr(*, ".internal.selfref")=<externalptr> 


> fruitExist <- garden$fruit %in% names(productFruit$type)
> fruitExist
[1]  TRUE FALSE  TRUE  TRUE FALSE


> usageExist <- sapply(garden$fruit, function(f){
+   sapply(garden$usage, '%in%', x = names(productFruit$type[[f]][["usage"]]))}) # return a list of 5
> usageExist
$`Fruit 1124`
     cooking cooking    NA drinking medicine
[1,]   FALSE   FALSE FALSE    FALSE    FALSE
[2,]   FALSE   FALSE FALSE    FALSE     TRUE

$`Fruit 100`
$`Fruit 100`$cooking
logical(0)

$`Fruit 100`$cooking
logical(0)

$`Fruit 100`$`NA`
logical(0)

$`Fruit 100`$drinking
logical(0)

$`Fruit 100`$medicine
logical(0)


$`Fruit 1051`
     cooking cooking    NA drinking medicine
[1,]    TRUE    TRUE FALSE    FALSE    FALSE
[2,]   FALSE   FALSE FALSE     TRUE    FALSE
[3,]   FALSE   FALSE FALSE    FALSE     TRUE

$`Fruit 1068`
     cooking cooking    NA drinking medicine
[1,]   FALSE   FALSE FALSE    FALSE    FALSE
[2,]    TRUE    TRUE FALSE    FALSE    FALSE
[3,]   FALSE   FALSE FALSE     TRUE    FALSE
[4,]   FALSE   FALSE FALSE    FALSE     TRUE

$`Fruit 1`
$`Fruit 1`$cooking
logical(0)

$`Fruit 1`$cooking
logical(0)

$`Fruit 1`$`NA`
logical(0)

$`Fruit 1`$drinking
logical(0)

$`Fruit 1`$medicine
logical(0)

Answer 1:


Well, this is essentially a nested loop. sapply(x, function(f) ...) simply takes each element in x and passes it as the argument f to the function. That function in your case is just another sapply statement.
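To make the nested-loop reading concrete, here is a minimal sketch. The vectors outer_vals and inner_vals are made up for illustration and are not part of the question's data; the point is only that a nested sapply visits the same (outer, inner) pairs as two nested for loops.

```r
# Hypothetical inputs, chosen only to illustrate the mechanics
outer_vals <- c("a", "b")
inner_vals <- c(1, 2, 3)

# Nested sapply: each element of outer_vals is passed in as f,
# and the inner sapply then runs once for that f
res <- sapply(outer_vals, function(f) {
  sapply(inner_vals, function(g) paste0(f, g))
})

# The same work written as an explicit nested loop
loop_res <- matrix(NA_character_, nrow = length(inner_vals), ncol = length(outer_vals))
colnames(loop_res) <- outer_vals
for (j in seq_along(outer_vals)) {
  for (i in seq_along(inner_vals)) {
    loop_res[i, j] <- paste0(outer_vals[j], inner_vals[i])
  }
}

res[, "a"]   # "a1" "a2" "a3"
```

Each column of res corresponds to one value of f, which mirrors how each element of usageExist corresponds to one fruit.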

So, usageExist <- sapply(garden$fruit, function(f){...}) simply passes each fruit in garden to the function. In your case, f is used in names(productFruit$type[[f]][["usage"]]). For instance, on the first iteration it passes Fruit 1124 from garden into the second sapply, where productFruit$type[[f]] looks up Fruit 1124 in productFruit, and [["usage"]] then picks out the usage element of that list.
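The lookup can be tried in isolation. The sketch below uses a cut-down, hypothetical copy of productFruit that keeps only one fruit and the names of its usage entries (the real object in the question has more fields). It also shows what happens for a fruit such as Fruit 100 that is missing from productFruit, which appears to be why those entries in the output are lists of logical(0).

```r
# Hypothetical, reduced version of productFruit (only the names matter here)
productFruit <- list(
  type = list(
    "Fruit 1124" = list(
      usage = list(eating = list(ID = 1), medicine = list(ID = 2))
    )
  )
)

# A fruit that exists: f selects its sub-list, and names() returns the usages
f <- "Fruit 1124"
names(productFruit$type[[f]][["usage"]])   # "eating" "medicine"

# A fruit that does not exist: [[ ]] on a list returns NULL for an unknown name
f <- "Fruit 100"
missing_names <- names(productFruit$type[[f]][["usage"]])
missing_names                # NULL

# NULL on the left of %in% yields a zero-length logical vector
missing_names %in% "cooking" # logical(0)
```

Because the zero-length results cannot be simplified into a matrix, sapply leaves them as a list, and since the fruits no longer all have the same shape, the outer sapply returns a list as well.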

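The second sapply, on the other hand, takes every element of garden$usage and passes it to the %in% function. Because x is supplied by name, each usage value ends up as the second argument (table) of %in%, so every column answers the question "which usage names of this fruit equal this garden$usage value?". You get cooking twice because, as you can see in your str(garden) output, it appears twice in garden$usage, which makes sense as you can cook a variety of fruits and vegetables, and not just one.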
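The inner call can be reproduced on its own. In this sketch, usage_names stands in for names(productFruit$type[["Fruit 1051"]][["usage"]]) and garden_usage for garden$usage, with values copied from the output in the question; both variable names are introduced here only for illustration.

```r
# Values taken from the question's str() and printed output
usage_names  <- c("cooking", "drinking", "medicine")                    # usages of Fruit 1051
garden_usage <- c("cooking", "cooking", "NA", "drinking", "medicine")   # garden$usage

# The form used in the question: x is fixed by name, so each element of
# garden_usage is matched to the remaining argument, table
inner <- sapply(garden_usage, '%in%', x = usage_names)

# The same call with the argument matching spelled out
explicit <- sapply(garden_usage, function(u) usage_names %in% u)

unname(inner[, 1])                  # TRUE FALSE FALSE
unname(inner[, "drinking"])         # FALSE TRUE FALSE
sum(colnames(inner) == "cooking")   # 2
```

There is one column per element of garden_usage, duplicates included, and one row per usage name of the fruit. If the duplicate column is not wanted, iterating over unique(garden_usage) instead is one option, though that changes the shape of the result compared with the original answer.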



Source: https://stackoverflow.com/questions/60912793/nested-sapply-in-r-breakdown
