I am trying to apply both imputation and one-hot encoding on my data set. I know that on applying imputation the dimension of the data might change, so I took care of it manually.
I've been struggling with a similar problem and I've found an approach that might help in this situation.
The main idea is to convert the column to the categorical dtype while you are still working with the complete dataset. Something like this:
dataframe[column] = dataframe[column].astype('category')
When you do that, the column stores the full set of available categories. Later, when you perform a train/test split, those categories are preserved even if some values do not appear in one of the resulting datasets.
Pandas' get_dummies function uses the column's categories to perform the encoding. Since the set of categories is stable, you will always get the same number of columns after encoding.
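Here is a minimal sketch of the idea (the column name and values are made up for illustration): a category that only appears in one part of the split still gets its own dummy column in both parts.

```python
import pandas as pd

df = pd.DataFrame({"color": ["red", "green", "blue", "red"]})

# Store the full set of categories on the column before splitting.
df["color"] = df["color"].astype("category")

# Simulate a train/test split where "blue" appears only in the test part.
train = df.iloc[:2]  # rows with "red", "green"
test = df.iloc[2:]   # rows with "blue", "red"

# Both encodings get a column for every category, in the same order.
train_enc = pd.get_dummies(train["color"])
test_enc = pd.get_dummies(test["color"])
print(list(train_enc.columns))  # ['blue', 'green', 'red']
print(list(test_enc.columns))   # ['blue', 'green', 'red']
```

Note that slicing a categorical column keeps the full category list, which is what makes the two encodings line up.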
I'm still exploring this solution myself. Keep in mind that you can also manipulate the categories directly if you need to, with something like this (note that set_categories returns a new Series rather than modifying the column in place):
dataframe[column] = dataframe[column].cat.set_categories([.....])
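For example, you could register a category that never occurs in the data at all, so the encoding still reserves a column for it (the values here are hypothetical):

```python
import pandas as pd

s = pd.Series(["red", "green"], dtype="category")

# Explicitly set the categories, including "yellow", which is absent
# from the data; set_categories returns a new Series.
s = s.cat.set_categories(["red", "green", "yellow"])

# get_dummies follows the category order, absent categories included.
print(pd.get_dummies(s).columns.tolist())  # ['red', 'green', 'yellow']
```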