Getting LinAlgError: Singular matrix Error

Submitted by 三世轮回 on 2021-02-05 08:10:30

Question


I'm using the class below to compute p-values for a fitted logistic regression model, but fitting it raises LinAlgError: Singular matrix.

import numpy as np
import scipy.stats as stat
from sklearn import linear_model

class LogisticRegression_with_p_values:
    def __init__(self, *args, **kwargs):
        self.model = linear_model.LogisticRegression(*args, **kwargs)

    def fit(self, X, y):
        self.model.fit(X, y)
        # p * (1 - p) for each row, written as 1 / (2 * (1 + cosh(z)))
        denom = (2.0 * (1.0 + np.cosh(self.model.decision_function(X))))
        denom = np.tile(denom, (X.shape[1], 1)).T
        # Fisher information matrix X^T W X
        F_ij = np.dot((X / denom).T, X)
        # This inversion is where LinAlgError: Singular matrix is raised
        Cramer_Rao = np.linalg.inv(F_ij)
        sigma_estimates = np.sqrt(np.diagonal(Cramer_Rao))
        z_scores = self.model.coef_[0] / sigma_estimates
        # Two-sided p-values from the standard normal distribution
        p_values = [stat.norm.sf(abs(x)) * 2 for x in z_scores]
        self.coef_ = self.model.coef_
        self.intercept_ = self.model.intercept_
        self.p_values = p_values

reg = LogisticRegression_with_p_values()
reg.fit(inputs_train, loan_data_targets_train)

Answer 1:


Error: fitting the model through the p-value class (reg = LogisticRegression_with_p_values(), then reg.fit(...)) throws LinAlgError: Singular matrix.

Step 1: Run the code below and look for missing values in the green line (the diagonal of the heatmap).

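import matplotlib.pyplot as plt
import seaborn as sns

corr = inputs_train.corr()
# Keep only correlations of at least 0.9; everything else becomes NaN
kot = corr[corr >= .9]
plt.figure(figsize=(18, 10))
sns.heatmap(kot, cmap="Greens")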

I was working on the Lending Club analysis when I hit this error. The heatmap above showed a missing value in the green line, so to investigate further I ran the code below to check whether the output is nan:

inputs_train['term:36'].corr(inputs_train['term:36'])

Output: nan
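To see why a nan self-correlation and the singular-matrix error tend to show up together, here is a minimal sketch on made-up data. The frame `toy` and the column 'grade:A' are assumptions for illustration only; 'term:36' is set to all zeros to mimic the broken dummy column. It uses the plain X^T X rather than the weighted matrix from the question, but the per-row weights do not change the fact that a zero column yields a zero row and column.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data: 'term:36' is all zeros, like the broken dummy column.
toy = pd.DataFrame({
    "grade:A": [1, 0, 1, 0, 1, 0],
    "term:36": [0, 0, 0, 0, 0, 0],
})

# Zero variance makes the Pearson correlation undefined, so pandas returns NaN
# (NumPy may also emit a RuntimeWarning about an invalid division).
self_corr = toy["term:36"].corr(toy["term:36"])

# X^T X gets an all-zero row and column, so it has no inverse.
X = toy.to_numpy(dtype=float)
gram = X.T @ X
rank = np.linalg.matrix_rank(gram)

try:
    np.linalg.inv(gram)
    inverted = True
except np.linalg.LinAlgError:
    inverted = False

print(self_corr, rank, inverted)
```

The rank comes out lower than the number of columns, which is another quick way to spot the problem without plotting anything.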

Next step:

When we created the variable 'term_int', we did not convert it from string to numeric. To verify this, check whether type(df_inputs_prepr_train['term_int'][0]) returns str.

So when we write:

df_inputs_prepr_train[df_inputs_prepr_train['term_int']==36]['term_int']

the output has zero rows, because 'term_int' still holds the string '36' and not the number 36.

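So when we use the code:

df_inputs_prepr_train['term:36']=np.where((df_inputs_prepr_train['term_int']==36),1,0)

it stores only zeros in the column. An all-zero column has zero variance, which is why its correlation is nan, and it makes the matrix passed to np.linalg.inv singular.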
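The string-versus-number mismatch can be reproduced on a few rows. This is only a sketch: `df` is a hypothetical stand-in for df_inputs_prepr_train, and the values "36" and "60" are assumed for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df_inputs_prepr_train; object dtype keeps the values as str.
df = pd.DataFrame({"term_int": pd.Series(["36", "60", "36", "60"], dtype=object)})

first_value_type = type(df["term_int"][0])      # the values are still strings
matching_rows = df[df["term_int"] == 36]        # "36" == 36 is False for every row
dummy = np.where(df["term_int"] == 36, 1, 0)    # so the dummy is never set to 1

print(first_value_type.__name__, len(matching_rows), dummy.tolist())
```

No exception is raised anywhere here, which is why the problem only surfaces later at the matrix inversion.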

Action to take:

df_inputs_prepr_train['term_int']=pd.to_numeric(df_inputs_prepr_train['term_int'])

Cross-verify: after re-creating 'term:36' from the converted column, run the heatmap again. There should be no missing values in the green line.
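The same check can also be done without a plot. The sketch below converts the column, rebuilds the dummy, and then looks for constant columns. `df` is again a hypothetical stand-in for df_inputs_prepr_train, and `constant_columns` is a helper made up for this example, not part of pandas.

```python
import numpy as np
import pandas as pd

def constant_columns(frame):
    """Hypothetical helper: names of columns that hold a single distinct value."""
    return [col for col in frame.columns if frame[col].nunique(dropna=False) <= 1]

# Hypothetical stand-in data; object dtype mimics the unconverted column.
df = pd.DataFrame({"term_int": pd.Series(["36", "60", "36", "60"], dtype=object)})

# Convert first, then rebuild the dummy so the comparison with 36 can match.
df["term_int"] = pd.to_numeric(df["term_int"])
df["term:36"] = np.where(df["term_int"] == 36, 1, 0)

remaining = constant_columns(df[["term:36"]])    # expected to be empty after the fix
self_corr = df["term:36"].corr(df["term:36"])    # no longer NaN

print(remaining, self_corr)
```

Running a check like this over all of inputs_train before fitting would flag any other dummy column that ended up constant for the same reason.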



Source: https://stackoverflow.com/questions/61744118/getting-linalgerror-singular-matrix-error
