I have a large dataset with many categorical features and want to use sklearn to one-hot encode them. One problem is that sklearn only handles the categories in the current d
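
To make the behavior in question concrete, here is a minimal sketch using sklearn's `OneHotEncoder`, which learns its categories from whatever data is passed to `fit`. The toy data and the variable names (`train`, `new`, `enc`, `full`) are hypothetical and not from the original question. The two workarounds shown, `handle_unknown="ignore"` and an explicit `categories` list, are only possible options and may not match the setup being asked about.

```python
from sklearn.preprocessing import OneHotEncoder

# Hypothetical toy data: a single categorical column.
train = [["red"], ["green"], ["red"]]
new = [["blue"], ["green"]]  # "blue" never appears in train

# The encoder learns categories only from the data passed to fit().
enc = OneHotEncoder(handle_unknown="ignore")
enc.fit(train)
learned = [list(c) for c in enc.categories_]  # one list per column

# With handle_unknown="ignore", a category unseen during fit
# is encoded as an all-zero row instead of raising an error.
encoded_new = enc.transform(new).toarray()

# Alternative: declare the full category set up front, so the
# output columns do not depend on which rows happen to be present.
full = OneHotEncoder(categories=[["blue", "green", "red"]])
encoded_full = full.fit_transform(new).toarray()

print(learned)
print(encoded_new)
print(encoded_full)
```

Declaring `categories` explicitly keeps the output width stable across chunks of a large dataset, at the cost of having to know the full category set in advance.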