When using LabelPropagation, I often run into this warning (imho it should be an error, because it makes the propagation fail completely):
/usr/local/lib/python
Basically you're computing a softmax function, right?
The general way to prevent softmax from overflowing or underflowing is (from here):
import numpy as np

# Instead of this . . .
def softmax(x, axis = 0):
    return np.exp(x) / np.sum(np.exp(x), axis = axis, keepdims = True)

# Do this
def softmax(x, axis = 0):
    e_x = np.exp(x - np.max(x, axis = axis, keepdims = True))
    return e_x / e_x.sum(axis = axis, keepdims = True)
This bounds e_x between 0 and 1, and ensures that one value of e_x is always 1 (namely the element at np.argmax(x)). This prevents both overflow and underflow, which occur when np.exp(x.max()) is larger or smaller than what float64 can represent.
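For example, on inputs where the naive version overflows to inf / inf = nan, the shifted version stays well-behaved:

import numpy as np

x = np.array([1000.0, 1001.0, 1002.0])

# Naive: np.exp(1000.0) overflows to inf, so the result is full of nans.
# Stable: the shifted exponents are [-2, -1, 0], which are perfectly safe.
e_x = np.exp(x - np.max(x))
print(e_x / e_x.sum())    # [0.09003057 0.24472847 0.66524096]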
In this case, as you can't change the algorithm, I would take the input D and use D_ = D - D.min(). This should be numerically equivalent to the trick above, since the largest entry of the exponent -gamma * D is -gamma * D.min() (the sign flip turns the minimum into the maximum), so subtracting D.min() plays the same role as subtracting the max. Then run your algorithm on D_, as in the sketch below.
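A minimal sketch of that shift (D and gamma are hypothetical stand-ins here; in practice they come from your pipeline):

import numpy as np

# Hypothetical squared-distance matrix between two point sets and kernel width.
D = np.array([[40.0, 90.0],
              [90.0, 40.0]])
gamma = 20.0

D_ = D - D.min()           # the smallest entry is now 0 ...
W = np.exp(-gamma * D_)    # ... so the largest exponent is 0 and W.max() == 1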
EDIT:
As recommended by @PaulBrodersen below, you can build a "safe" rbf kernel based on the sklearn implementation here:
import numpy as np
import sklearn.metrics.pairwise

def rbf_kernel_safe(X, Y=None, gamma=None):
    X, Y = sklearn.metrics.pairwise.check_pairwise_arrays(X, Y)
    if gamma is None:
        gamma = 1.0 / X.shape[1]
    K = sklearn.metrics.pairwise.euclidean_distances(X, Y, squared=True)
    K *= -gamma
    K -= K.max()    # shift so the largest exponent is 0, preventing overflow
    np.exp(K, K)    # exponentiate K in-place
    return K
And then use it in your propagation:
LabelPropagation(kernel = rbf_kernel_safe, tol = 0.01, gamma = 20).fit(X, Y)
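One caveat, if I read the sklearn source correctly: a callable kernel is invoked as kernel(X, Y) without the estimator's gamma, so the gamma = 20 above may never reach rbf_kernel_safe. If that's the case, you can bind it yourself, e.g. with functools.partial:

import functools

LabelPropagation(kernel = functools.partial(rbf_kernel_safe, gamma = 20), tol = 0.01).fit(X, Y)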
Unfortunately I only have sklearn v0.18, which doesn't accept user-defined kernel functions for LabelPropagation, so I can't test it.
EDIT2:
Checking your source for why you have such large gamma values makes me wonder if you are using gamma = D.min()/3, which would be incorrect. The source defines sigma = D.min()/3, while sigma enters w as

w = exp(-d**2 / sigma**2)    # Equation (1)

which makes the correct gamma value 1/sigma**2, i.e. 9/D.min()**2.
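As a quick sanity check of that conversion (with a hypothetical D.min() value):

# Hypothetical example: suppose the smallest pairwise distance is 0.1.
D_min = 0.1

sigma = D_min / 3.0         # the definition from your source
gamma = 1.0 / sigma ** 2    # the value the rbf kernel actually expects
print(gamma)                # 900.0, i.e. 9 / D_min**2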