The pooled covariance matrix of TRAINING must be positive definite

Anonymous (unverified), submitted 2019-12-03 10:24:21

Question:

I know this question has already been asked a couple of times, but I couldn't find a solution to my problem.

I don't have more variables than observations, and there are no NaN values in my matrix. Here's my function:

    function [ind, idx_ran] = fselect(features_f, class_f, dir)

    idx = linspace(1,size(features_f, 2), size(features_f, 2));
    idx_ran = idx(:,randperm(size(features_f, 2)));
    features_t_ran = features_f(:,idx_ran); % randomize colums

    len = length(class_f);
    r = randi(len, [1, round(len*0.15)]);

    x = features_t_ran; y = class_f;
    xtrain = x; ytrain = y;
    xtrain(r,:) = []; ytrain(r,:) = [];
    xtest = x(r,:); ytest = y(r,:);

    f = @(xtrain, ytrain, xtest, ytest)(sum(~strcmp(ytest, classify(xtest, xtrain, ytrain))));
    fs = sequentialfs(f, x, y, 'direction', dir);

    ind = find(fs < 1);

    end

And here are my test and training data:

    >> whos xtest
      Name         Size              Bytes  Class     Attributes

      xtest      524x42             176064  double

    >> whos xtrain
      Name         Size              Bytes  Class     Attributes

      xtrain    3008x42            1010688  double

    >> whos ytest
      Name         Size              Bytes  Class     Attributes

      ytest      524x1               32488  cell

    >> whos ytrain
      Name         Size              Bytes  Class     Attributes

      ytrain    3008x1              186496  cell

And here's the error:

    Error using crossval>evalFun (line 465)
    The function '@(xtrain,ytrain,xtest,ytest)(sum(~strcmp(ytest,classify(xtest,xtrain,ytrain))))' generated the following error:
    The pooled covariance matrix of TRAINING must be positive definite.

    Error in crossval>getFuncVal (line 482)
    funResult = evalFun(funorStr,arg(:));

    Error in crossval (line 324)
        funResult = getFuncVal(1, nData, cvp, data, funorStr, []);

    Error in sequentialfs>callfun (line 485)
        funResult = crossval(fun,x,other_data{:},...

    Error in sequentialfs (line 353)
                    crit(k) = callfun(fun,x,other_data,cv,mcreps,ParOptions);

    Error in fselect (line 26)
    fs = sequentialfs(f, x, y, 'direction', dir);

    Error in workflow_forward (line 31)
        [ind, idx_ran] = fselect(features_f, class_f, 'forward');

this was working yesterday. :/

Answer 1:

If you inspect the function classify, you will find that the error is raised when it checks the conditioning of the matrix R obtained from a QR decomposition of your training matrix. In other words, it is unhappy with the training matrix you are providing: it finds the matrix ill-conditioned, so any solution would be unstable. The function performs the equivalent of a matrix inversion, and for an ill-conditioned training matrix that amounts to dividing by a very small number.
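To see whether a given training set would trip this kind of check, you can approximate it by hand. The sketch below is only an illustration of the idea described above, not the toolbox's actual code: it centers each class on its own mean, takes an economy-size QR decomposition, and looks at the singular values of R. The data here is a synthetic stand-in (the fourth column is deliberately the sum of the first two); substitute your own xtrain and ytrain.

```matlab
% Synthetic stand-in data (assumption): 100 observations, 4 features, 2 classes.
rng(0);
xtrain = randn(100, 4);
xtrain(:, 4) = xtrain(:, 1) + xtrain(:, 2);   % deliberately dependent column
ytrain = repmat({'a'; 'b'}, 50, 1);

% Map the cell array of labels to numeric group indices.
[gnames, ~, gidx] = unique(ytrain);
ngroups = numel(gnames);
[n, d] = size(xtrain);

% Subtract each group's mean from that group's observations.
centered = xtrain;
for k = 1:ngroups
    rows = (gidx == k);
    centered(rows, :) = bsxfun(@minus, xtrain(rows, :), mean(xtrain(rows, :), 1));
end

% Economy-size QR; the singular values of R show how close to singular
% the pooled covariance estimate is.
[~, R] = qr(centered, 0);
s = svd(R / sqrt(n - ngroups));
r = rank(centered);
fprintf('rank %d of %d columns, largest/smallest singular value = %g\n', ...
        r, d, max(s) / min(s));
```

A rank below the number of columns, or a very large ratio between the largest and smallest singular value, suggests that classify would reject the same data.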

It seems that by shrinking the size of your training set the stability was reduced. My suggestion is to use a larger training set if possible.

Edit

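You may be wondering how it is possible to have more observations than variables and still have an ill-conditioned problem. The answer is that different observations can be linear combinations of each other, so the number of linearly independent observations can still be smaller than the number of variables. Equivalently, some of the feature columns are then (nearly) linear combinations of the others.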
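A tiny made-up example makes this concrete. Here six observations of three variables are all built as weighted combinations of just two base observations, so the matrix has more rows than columns and still has rank 2. The names base, w, and X are purely illustrative.

```matlab
% Two linearly independent base observations of 3 variables.
base = [1 2 3;
        4 5 6];

% Each row of w gives the weights used to build one observation.
w = [ 1    0;
      0    1;
      1    1;
      2   -1;
      0.5  3;
     -1    2];

X = w * base;            % 6 observations, 3 variables

nObs  = size(X, 1);
nVars = size(X, 2);
rk    = rank(X);         % number of linearly independent rows/columns

fprintf('%d observations, %d variables, rank %d\n', nObs, nVars, rk);

% The covariance matrix of X is therefore singular (not positive definite).
smallestEig = min(eig(cov(X)));
fprintf('smallest eigenvalue of cov(X): %g\n', smallestEig);
```

Because the rank is smaller than the number of variables, the covariance matrix has an eigenvalue that is zero up to rounding error, which is exactly the situation a positive-definiteness check rejects.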


