BNlearn R error “variable Variable1 must have at least two levels.”

Posted by 巧了我就是萌 on 2019-12-31 04:10:08

Question


I'm trying to create a Bayesian network (BN) using bnlearn, but I keep getting this error:

Error in check.data(data, allowed.types = discrete.data.types) : variable Variable1 must have at least two levels.

It gives me that error for every one of my variables, even though they are all factors with more than one level. As you can see, in this case my variable "model" has 4 levels.

As I can't share the real variables and dataset, I've created a small data set and the code that goes with it, and I get the same problem. I've only shared 2 variables here, but I get the same error for all of my variables.

library(tidyverse)
library(bnlearn)
library(openxlsx)

DataFull <- read.xlsx("(.....)/test.xlsx", sheet = 1, startRow = 1, colNames = TRUE)
set.seed(600)
DataFull <- as_tibble(DataFull)

DataFull$Variable1 <- as.factor(DataFull$Variable1)
DataFull$TargetVar <- as.factor(DataFull$TargetVar)

DataFull <- na.omit(DataFull)
DataFull <- droplevels(DataFull)

DataFull <- DataFull[sample(nrow(DataFull)),]
Data <- DataFull[1:(as.integer(nrow(DataFull)*0.70)-1),]
Datatest <- DataFull[as.integer(nrow(DataFull)*0.70):nrow(DataFull),]
nrow(Data)+nrow(Datatest)==nrow(DataFull)

FocusVar <- as.character("TargetVar")
BN.naive <- naive.bayes(Data, FocusVar) 

Using str(Data), I can see that each variable already has 2 or more levels:

str(Data)

Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   27586 obs. of  2 variables:
 $ Variable1: Factor w/ 3 levels "Small","Medium",..: 2 2 3 3 3 3 3 3 3 3 ...
 $ TargetVar: Factor w/ 2 levels "Yes","No": 1 1 1 1 1 1 2 1 1 1 ...

Link to data set: https://drive.google.com/open?id=1VX2xkPdeHKdyYqEsD0FSm1BLu1UCtOj9eVIVfA_KJ3g


Answer 1:


bnlearn expects a data.frame and doesn't work with tibbles, so keep your data as a data.frame by omitting the line DataFull <- as_tibble(DataFull).
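As a rough sketch of that fix applied to the question's workflow: the data below is randomly generated stand-in data (the real spreadsheet isn't reproduced here), and the as.data.frame() call is an extra safeguard I'm assuming is useful if some earlier step has already turned the data into a tibble.

```r
library(bnlearn)

set.seed(600)

# Hypothetical stand-in for the question's spreadsheet: two factor columns
DataFull <- data.frame(
  Variable1 = factor(sample(c("Small", "Medium", "Large"), 200, replace = TRUE)),
  TargetVar = factor(sample(c("Yes", "No"), 200, replace = TRUE))
)

# No as_tibble() step. If the data is already a tibble at this point,
# as.data.frame() converts it back to a plain data.frame.
Data <- as.data.frame(DataFull)

# Fit the naive Bayes structure with TargetVar as the class node
BN.naive <- naive.bayes(Data, training = "TargetVar")
BN.naive
```

Converting right before the bnlearn call keeps the rest of a tidyverse pipeline untouched.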

Example

library(tibble)
library(bnlearn)

d <- as_tibble(learning.test)
hc(d)

Error in check.data(x) : variable A must have at least two levels.

In particular, the error comes from this check in bnlearn:::check.data:

if (nlevels(x[, col]) < 2) 
      stop("variable ", col, " must have at least two levels.")

In a standard data.frame, learning.test[,"A"] returns a vector, so nlevels(learning.test[,"A"]) works as expected. By design, however, you cannot extract vectors this way from tibbles: d[,"A"] is still a tbl_df and not a vector, so nlevels(d[,"A"]) doesn't work as expected and returns zero.
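The difference can be checked directly. This is a small sketch using the same learning.test data as above; the names df and net are only illustrative, and the final hc() call assumes that converting back with as.data.frame() is an acceptable workaround.

```r
library(tibble)
library(bnlearn)

d <- as_tibble(learning.test)

# Single-bracket column extraction drops to a vector for a data.frame ...
class(learning.test[, "A"])
nlevels(learning.test[, "A"])

# ... but stays a one-column tibble for a tibble, which has no "levels"
# attribute of its own, so nlevels() reports 0
class(d[, "A"])
nlevels(d[, "A"])

# Converting back to a plain data.frame restores the behaviour bnlearn expects
df <- as.data.frame(d)
nlevels(df[, "A"])
net <- hc(df)
```

With a tibble, d[["A"]] or d$A would give the underlying factor, but the check shown above uses x[, col], so the data has to be converted before it is passed to bnlearn.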



Source: https://stackoverflow.com/questions/54922515/bnlearn-r-error-variable-variable1-must-have-at-least-two-levels
