Expand grid in R with unlist and apply

Submitted by 自闭症网瘾萝莉.ら on 2021-01-29 05:16:09

Question


I want to use R's expand.grid to enumerate every option for a hierarchical clustering analysis. I have a function acc that takes a matrix and computes performance measures such as accuracy, precision and F1, returning them as a named list. The output I'm after is a table that lists every hyperparameter combination and, in the columns next to it, the performance measures (accuracy, F1, ...).

The table of combinations can be set up for example with

hyperparams =  expand.grid(meths=c("ward.D","ward.D2","single","complete","average","mcquitty","median","centroid"), dists=c("euclidean", "maximum", "manhattan", "canberra", "binary","minkowski"))
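One detail worth knowing about the grid: expand.grid turns character vectors into factors by default. A minimal sketch, assuming plain strings are preferred for passing on as method arguments, sets stringsAsFactors = FALSE:

```r
# Build the grid with character columns instead of factors
hyperparams <- expand.grid(
  meths = c("ward.D", "ward.D2", "single", "complete",
            "average", "mcquitty", "median", "centroid"),
  dists = c("euclidean", "maximum", "manhattan",
            "canberra", "binary", "minkowski"),
  stringsAsFactors = FALSE
)

nrow(hyperparams)  # 8 linkage methods x 6 distance measures = 48 rows
str(hyperparams)   # both columns are character vectors
```

The first column varies fastest, so the first eight rows pair every linkage method with "euclidean".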

Next, for each combination, I would compare the clustering to the known labels and compute the accuracy. This involves a few wrapper functions (such as cutree) that I've omitted here for brevity:

t1 = table(df$Group, hclust(dist(df[-1],method="euclidean"), method="complete"))
Res1 = acc(t1)

The goal is to vary the method argument for dist across those listed in my dists, and the method argument for hclust across those listed in my meths. In the final line, recall that I've written acc, which will take a matrix and output a named list of accuracy, precision, F1,... which I'd like each on a column of a final table, whose rows are the hyperparameter combinations in hyperparams.

Now, my first issue is, I'm not sure how to use unlist in a way that will cover all the options above. I'm pretty sure it's the right function but just not sure how to do it. And I also want to create the table without a for-loop, i.e. using apply or something like that (I guess applying along the rows of hyperparams?...), since I know such solutions are generally better in R.
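To illustrate what applying along the rows of hyperparams looks like mechanically, here is a small sketch on a reduced grid (the two-by-two grid and the pasted label are only for illustration). apply first coerces the data frame to a character matrix, so each row reaches the function as a named character vector:

```r
# Reduced grid, purely for illustration
hyperparams <- expand.grid(
  meths = c("single", "complete"),
  dists = c("euclidean", "manhattan"),
  stringsAsFactors = FALSE
)

# MARGIN = 1 walks over rows; row[["dists"]] and row[["meths"]] are strings
labels <- apply(hyperparams, 1, function(row) {
  paste(row[["dists"]], row[["meths"]], sep = " + ")
})

labels
```

In a real run, the body of the function would call dist and hclust with row[["dists"]] and row[["meths"]] instead of pasting them together.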

As suggested, the final desired output would be, effectively, hyperparams but as a data-frame with additional columns, the third column containing accuracy, fourth containing precision, etc (the measures listed out in my function acc). Can anyone inform me how to get there?

If you want something to play with for acc, we could use

acc = function(x) {
  first = sum(x)
  second = sum(x^2)
  return(list(First=first,Second=second))
}

and the final output table would be the two hyperparameter columns followed by a column for First (sum of elements in the final confusion matrix, for the hyperparameter combo corresponding to that row) and Second (sum of elements^2 in the final confusion matrix). Just a hypothetical example in case you like to work with given functions.

I'd really prefer solutions in base R! (Or dplyr if absolutely necessary)

Edit: OK, many people are asking for a df. Let's use iris, but of course if we want output we can't avoid some of the intermediate functions, like cutree.

Now with iris, you could run

contingtab1 = table(iris$Species, cutree(hclust(dist(iris[,1:4],method="euclidean"),method="complete"),3))

That gives a contingency table. Passing it into acc would give one row of the desired output (the row corresponding to euclidean and complete). The desired output would then look like hyperparams, with its two current columns followed by (say) two more columns, one for each of the two performance measures in acc.
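Putting the pieces of the question together, one possible base R sketch of the whole pipeline on iris follows. It uses the toy acc from the question; run_one is a hypothetical helper name introduced here, and k = 3 is taken from the cutree call above.

```r
# Toy stand-in for acc, as given in the question
acc <- function(x) {
  list(First = sum(x), Second = sum(x^2))
}

hyperparams <- expand.grid(
  meths = c("ward.D", "ward.D2", "single", "complete",
            "average", "mcquitty", "median", "centroid"),
  dists = c("euclidean", "maximum", "manhattan",
            "canberra", "binary", "minkowski"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: one hyperparameter combination in,
# one named list of measures out
run_one <- function(meth, dist_method) {
  d <- dist(iris[, 1:4], method = dist_method)
  clusters <- cutree(hclust(d, method = meth), k = 3)
  acc(table(iris$Species, clusters))
}

# Walk over the two grid columns in parallel, keeping a list of lists
measures <- mapply(run_one, hyperparams$meths, hyperparams$dists,
                   SIMPLIFY = FALSE, USE.NAMES = FALSE)

# Turn each named list into a one-row data frame, stack them,
# and attach the result to the grid
measures_df <- do.call(rbind, lapply(measures, as.data.frame))
results <- cbind(hyperparams, measures_df)

head(results)
```

Each row of results holds one combination followed by First and Second. With a real acc, the extra columns would be whatever names the returned list carries.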


Answer 1:


We can use Map in base R

# x is the distance measure passed to dist, y the linkage method passed to hclust
Map(function(x, y) acc(hclust(dist(df[-1], method = x), method = y)),
    hyperparams$dists, hyperparams$meths)
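Map returns a list with one element per combination, each element being the named list produced by acc. This is where unlist comes in. A small sketch, using the toy acc from the question on two made-up matrices, flattens each element and stacks the results:

```r
# Toy acc from the question
acc <- function(x) {
  list(First = sum(x), Second = sum(x^2))
}

# Stand-in for the Map output: one named list per combination
res <- lapply(list(diag(2), matrix(1:4, nrow = 2)), acc)

# unlist() turns each named list into a named numeric vector;
# rbind stacks those vectors into a matrix, one row per combination
measures <- do.call(rbind, lapply(res, unlist))
measures_df <- as.data.frame(measures)

measures_df
```

The resulting data frame has one column per measure and can be joined to the grid with cbind(hyperparams, measures_df), provided the rows are in the same order as the grid.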



Answer 2:


One approach might be map2 from purrr

library(purrr)
# .x is the distance measure passed to dist, .y the linkage method passed to hclust
map2(hyperparams$dists, hyperparams$meths,
     ~ acc(hclust(dist(df[-1], method = .x), method = .y)))


Source: https://stackoverflow.com/questions/61687341/expand-grid-in-r-with-unlist-and-apply
