The basic task that I have at hand is
a) Read some tab separated data.
b) Do some basic preprocessing
c) For each categorical column use LabelE
According to the LabelEncoder implementation, the pipeline you've described will work correctly if and only if you fit LabelEncoders at the test time with data that have exactly the same set of unique values.
There's a somewhat hacky way to reuse LabelEncoders you got during train. LabelEncoder has only one property, namely, classes_. You can pickle it, and then restore like
Train:
encoder = LabelEncoder()
encoder.fit(X)
numpy.save('classes.npy', encoder.classes_)
Test
encoder = LabelEncoder()
encoder.classes_ = numpy.load('classes.npy')
# Now you should be able to use encoder
# as you would do after `fit`
This seems more efficient than refitting it using the same data.