I am trying to run grangercausalitytests on two time series:
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grang
The problem arises because the two series in your data are perfectly (noise-free) related. From the traceback you can see that, internally, a Wald test is used to test whether the coefficients on the lagged values of the second series are jointly zero. That test needs an estimate of the covariance matrix of the parameters (which is near zero here, because the fit is essentially exact) and its inverse, as you can see from the line invcov = np.linalg.inv(cov_p) in the traceback. For a large enough maximum lag (>= 5) this near-zero matrix becomes numerically singular, so the inversion fails and the test crashes. If you add just a little noise to your data, the error disappears:
import numpy as np
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

n = 1000
ls = np.linspace(0, 2*np.pi, n)
# Two noise-free sine waves; the second is a scaled, phase-shifted copy of the first
df1Clean = pd.DataFrame(np.sin(ls))
df2Clean = pd.DataFrame(2*np.sin(ls+1))
dfClean = pd.concat([df1Clean, df2Clean], axis=1)
# Same data with a tiny amount of uniform noise added to every value
dfDirty = dfClean+0.00001*np.random.rand(n, 2)
grangercausalitytests(dfClean, maxlag=20, verbose=False) # Raises LinAlgError
grangercausalitytests(dfDirty, maxlag=20, verbose=False) # Runs fine
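The inversion step named in the traceback can also be looked at in isolation. The following is a minimal sketch using only NumPy. The 2x2 matrix is a made-up stand-in for cov_p, not the matrix statsmodels actually builds, and the perturbation is only an analogy for what the added noise does:

```python
import numpy as np

# Hypothetical stand-in for cov_p: the second row is exactly twice the first,
# so the matrix has rank 1 and no inverse exists.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])

try:
    np.linalg.inv(singular)
    inverted = True
except np.linalg.LinAlgError as err:
    inverted = False
    print("inversion failed:", err)

# A tiny perturbation makes the matrix invertible again,
# although it remains badly conditioned.
perturbed = singular + 1e-6 * np.eye(2)
inv_perturbed = np.linalg.inv(perturbed)
print("condition number:", np.linalg.cond(perturbed))
```

The large condition number is a reminder that adding noise only sidesteps the crash; the results on near-deterministic data should still be interpreted with care.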
Another thing to keep an eye out for is duplicate columns. Duplicate columns have a correlation of exactly 1.0, which makes the regressors collinear and results in singularity. It's also possible that you have two distinct features that are perfectly correlated. An easy way to check for both is df.corr(): look for pairs of columns whose correlation is 1.0 (or -1.0).
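The df.corr() check can be automated along these lines. This is only a sketch: the helper name perfectly_correlated_pairs, the tolerance tol, and the demo frame are assumptions made for illustration and are not part of pandas or statsmodels:

```python
import numpy as np
import pandas as pd


def perfectly_correlated_pairs(df, tol=1e-9):
    """Return (col_a, col_b, corr) for column pairs with |corr| within tol of 1."""
    corr = df.corr().to_numpy()
    cols = list(df.columns)
    pairs = []
    # Work by position so frames with duplicate column names are handled too.
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            if abs(abs(corr[i, j]) - 1.0) <= tol:
                pairs.append((cols[i], cols[j], float(corr[i, j])))
    return pairs


# Made-up demo data: "b" is an exact linear function of "a", "c" is unrelated.
x = np.arange(10, dtype=float)
demo = pd.DataFrame({
    "a": x,
    "b": 2 * x + 1,
    "c": [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0, 5.0, 3.0],
})
pairs = perfectly_correlated_pairs(demo)
print(pairs)
```

Note that this only catches contemporaneous linear dependence. Series that are perfectly predictable from each other's lags, like the phase-shifted sine waves above, can have a correlation well below 1.0 and still trigger the error.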