How to correctly get the weights using Spark for a synthetic dataset?
Question: I'm running LogisticRegressionWithSGD on Spark on a synthetic dataset. I've calculated the error in MATLAB, using vanilla gradient descent, and in R; it is ~5%. The weights I got there were similar to those used in the model that I used to generate y. The dataset was generated using this example. With Spark, I am able to get a very close error rate at the end by tuning the step size, but the weights for individual features are not the same. In fact, they vary a lot. I tried LBFGS for Spark and it's able to predict both