I have a training set that looks like this:
Name Day Area X Y Month Night
ATTACK Monday LA -122.41 37.78 8 0
VEHICLE Saturday CHICAGO -1.67 3.15 2 0
MOUSE Monday TAIPEI -12.5 3.1 9 1
Name is the outcome/dependent variable. I converted Name, Area and Day into factors, but I wasn't sure if I was supposed to for Month and Night, which only take on integer values 1-12 and 0-1, respectively.
I then converted the data into matrices:
ynn <- model.matrix(~Name , data = trainDF)
mnn <- model.matrix(~ Day+Area +X + Y + Month + Night, data = trainDF)
I then set up the tuning parameters:
nnTrControl <- trainControl(method = "repeatedcv", number = 3, repeats = 5,
                            verboseIter = TRUE, returnData = FALSE,
                            returnResamp = "all", classProbs = TRUE,
                            summaryFunction = multiClassSummary,
                            allowParallel = TRUE)
nnGrid <- expand.grid(.size = c(1, 4, 7), .decay = c(0, 0.001, 0.1))
model <- train(y = ynn, x = mnn, method = "nnet", linout = TRUE, trace = FALSE,
               trControl = nnTrControl, metric = "logLoss", tuneGrid = nnGrid)
However, the call to train fails with the error
Error: nrow(x) == n is not TRUE
I get a similar error if I use xgboost instead of nnet.
Does anyone know what's causing this?
y should be a numeric or factor vector containing the outcome for each sample, not a matrix. Using
train(y = make.names(trainDF$Name), ...)
helps. Here make.names modifies the values so that they are syntactically valid variable names.
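A minimal sketch of why the check fails, using only base R and a three-row toy version of the data from the question. It assumes that train compares nrow(x) against length(y), which is what the error message suggests. The object y and the crime-style labels passed to make.names at the end are hypothetical, added only for illustration.

```r
# Toy data modelled on the rows shown in the question
trainDF <- data.frame(
  Name  = factor(c("ATTACK", "VEHICLE", "MOUSE")),
  Day   = factor(c("Monday", "Saturday", "Monday")),
  Area  = factor(c("LA", "CHICAGO", "TAIPEI")),
  X     = c(-122.41, -1.67, -12.5),
  Y     = c(37.78, 3.15, 3.1),
  Month = c(8, 2, 9),
  Night = c(0, 0, 1)
)

# Outcome and predictors built the same way as in the question
ynn <- model.matrix(~ Name, data = trainDF)
mnn <- model.matrix(~ Day + Area + X + Y + Month + Night, data = trainDF)

# ynn is a matrix (intercept plus one dummy column per extra level),
# so its length is rows * columns rather than the number of samples
dim(ynn)
nrow(mnn)
length(ynn)

# A factor vector has exactly one entry per sample
y <- factor(make.names(trainDF$Name))
length(y) == nrow(mnn)

# make.names replaces characters that are not allowed in R names
# (hypothetical labels, not taken from the question)
make.names(c("LARCENY/THEFT", "NON-CRIMINAL"))
```

With three rows and three outcome levels, length(ynn) is 9 while nrow(mnn) is 3, so a row-count check against the outcome cannot pass. Because trainControl is called with classProbs = TRUE, the class levels also need to be valid R names, which is where make.names comes in.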
Although the help file for train says that either a matrix or a data frame is expected, you can try converting the matrix to a data frame:
model <- train(y=ynn, x=as.data.frame(mnn), method='nnet',linout=TRUE, trace = FALSE, trControl = nnTrControl,metric="logLoss", tuneGrid=nnGrid)
Source: https://stackoverflow.com/questions/35527492/error-nrowx-n-is-not-true-when-using-train-in-caret