When I do:
summing += yval * np.log(sigmoid(np.dot(w.transpose(), xi.transpose()))) + (1-yval)* np.log(1-sigmoid(np.dot(w.transpose(), xi.transpose())))
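I had this same problem. It looks like you're doing logistic regression; I hit this while doing multi-class classification with logistic regression. Logistic regression itself is a binary classifier, so for more than two classes you need the one-vs-all approach (google for details): train one classifier per class, each treating that class as positive and everything else as negative.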
If yval holds class labels such as yval = [1,2,3,4,...] instead of only '1' and '0', the (1-yval) factor becomes negative for any label above 1, so the cost itself can go negative. Minimizing it then drives theta off without bound, the sigmoid saturates, and you end up evaluating log(y) with y close to zero.
The fix should be to preprocess your yval variable so that it contains only '1' for positive examples and '0' for negative examples.
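Here is a minimal sketch of that preprocessing step, assuming NumPy and integer class labels. The names binarize_labels and cost, the toy X and y data, and the eps clipping guard are all my own illustration, not anything from your code, so adapt them to your variables.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binarize_labels(y, positive_class):
    """Return 1.0 where y equals positive_class and 0.0 everywhere else."""
    return (np.asarray(y) == positive_class).astype(float)

def cost(w, X, y01, eps=1e-12):
    """Mean cross-entropy cost. Only meaningful when y01 holds 0s and 1s."""
    h = sigmoid(X @ w)
    h = np.clip(h, eps, 1.0 - eps)  # assumption: small guard so log() never sees exactly 0
    return -np.mean(y01 * np.log(h) + (1.0 - y01) * np.log(1.0 - h))

# Toy data (made up): 6 examples, a bias column plus one feature, 3 classes.
y = np.array([1, 2, 3, 1, 2, 3])
X = np.array([[1.0, 0.5], [1.0, -1.2], [1.0, 2.0],
              [1.0, 0.3], [1.0, -0.7], [1.0, 1.5]])

# One-vs-all: one binary target vector (and one cost) per class.
w0 = np.zeros(X.shape[1])
costs = {k: cost(w0, X, binarize_labels(y, k)) for k in np.unique(y)}

# For contrast: feeding the raw labels 1, 2, 3 straight in gives a negative cost.
w_demo = np.array([5.0, 0.0])
raw_cost = cost(w_demo, X, y.astype(float))
```

With the binarized targets every per-class cost stays non-negative, while raw_cost comes out below zero, which is the symptom described above. In a real one-vs-all setup you would fit a separate w for each class and predict the class whose classifier gives the highest sigmoid output.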