If a sklearn.preprocessing.LabelEncoder has been fitted on a training set, its transform method raises a ValueError if it encounters previously unseen values in a test set.
The only solution I c
I get the impression that what you've done is quite similar to what other people do when faced with this situation.
There's been some effort to add the ability to encode unseen labels to the LabelEncoder (see especially https://github.com/scikit-learn/scikit-learn/pull/3483 and https://github.com/scikit-learn/scikit-learn/pull/3599), but changing the existing behavior is actually more difficult than it seems at first glance.
For now it looks like handling "out-of-vocabulary" labels is left to individual users of scikit-learn.
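As a rough illustration of the kind of user-side workaround this implies, here is a minimal sketch that maps any label not seen during fitting to a sentinel code. The helper name `transform_with_unknown` and the sentinel value `-1` are my own assumptions for the example and are not part of scikit-learn; only `LabelEncoder`, its `classes_` attribute, and `transform` come from the library.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder


def transform_with_unknown(encoder, values, unknown_code=-1):
    """Encode values with a fitted LabelEncoder, mapping unseen labels to unknown_code.

    Hypothetical helper, not a scikit-learn API.
    """
    values = np.asarray(values)
    # Boolean mask of the entries the encoder saw during fit.
    known = np.isin(values, encoder.classes_)
    # Start with every position marked as unknown.
    codes = np.full(values.shape, unknown_code, dtype=int)
    # Only pass known labels to transform, so it never raises on unseen ones.
    if known.any():
        codes[known] = encoder.transform(values[known])
    return codes


encoder = LabelEncoder().fit(["cat", "dog", "fish"])
codes = transform_with_unknown(encoder, ["dog", "bird", "cat"])
print(codes)  # "bird" was not in the training labels, so it gets -1
```

Whether a single sentinel is appropriate depends on the downstream model: some estimators will treat `-1` as just another ordinal value, so you may prefer to reserve a dedicated "unknown" class during fitting instead.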