I modified the BernoulliRBM class of scikit-learn to use groups of softmax visible units. In the process, I added an extra NumPy array visible_config as a class
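Without seeing your code, it's hard to tell exactly what goes wrong, but you are violating a scikit-learn API convention here. The constructor of an estimator should only set attributes to the values the user passes as arguments. All computation should occur in fit, and if fit needs to store the result of a computation, it should do so in an attribute with a trailing underscore (_). This convention is what makes clone and meta-estimators such as GridSearchCV work: clone builds a fresh, unfitted copy by calling the constructor with the values from get_params, and parameter changes go through set_params, which does not rerun the constructor, so anything computed in the constructor can end up out of sync with the parameters.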
(*) If you ever see an estimator in the main codebase that violates this rule: that would be a bug, and patches are welcome.
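Since the original code isn't shown, here is only a minimal sketch of an estimator that follows the convention. The class name SoftmaxGroupRBM and the group_sizes parameter are made up for illustration, and I am assuming that visible_config maps each visible unit to the index of its softmax group; your array may mean something else. The training code itself is left out.

```python
import numpy as np
from sklearn.base import BaseEstimator, clone


class SoftmaxGroupRBM(BaseEstimator):
    """Hypothetical estimator; only the parameter handling is shown."""

    def __init__(self, n_components=256, group_sizes=None):
        # Store the arguments exactly as passed: no copies, no validation,
        # no derived arrays.
        self.n_components = n_components
        self.group_sizes = group_sizes  # hypothetical parameter

    def fit(self, X, y=None):
        X = np.asarray(X)
        # Default (an assumption): all visible units form a single group.
        sizes = [X.shape[1]] if self.group_sizes is None else self.group_sizes
        if sum(sizes) != X.shape[1]:
            raise ValueError("group_sizes must sum to the number of features")
        # Derived state is created in fit and gets a trailing underscore.
        # Assumed meaning: visible_config_[j] is the group of visible unit j.
        self.visible_config_ = np.repeat(np.arange(len(sizes)), sizes)
        return self


rbm = SoftmaxGroupRBM(n_components=16, group_sizes=[2, 3])
fresh = clone(rbm)               # rebuilt from get_params(), not yet fitted
rbm.fit(np.zeros((4, 5)))
print(rbm.visible_config_)       # [0 0 1 1 1]
print(hasattr(fresh, "visible_config_"))  # False
```

Because the constructor stores group_sizes untouched, clone can rebuild the estimator from get_params alone, and a grid search over group_sizes would recompute visible_config_ on every fit instead of reusing a stale array.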