Convert MNIST data from numpy arrays to original ubyte data

怎甘沉沦 posted on 2021-01-29 20:02:28

Question


I used this code almost exactly, changing only these lines:

f = gzip.open("../data/mnist.pkl.gz", 'rb')
training_data, validation_data, test_data = cPickle.load(f)

to these lines:

import pickle as cPickle
f = gzip.open("mnist.pkl.gz", 'rb')
u = cPickle._Unpickler(f)
u.encoding='latin1'
training_data, validation_data, test_data = u.load()

to account for pickling issues. The original mnist.pkl.gz was downloaded from his repo (available here); alternatively, the code to generate the .pkl.gz is here. The output is great: it's a pickled set of numpy arrays holding the training and test data, and when I print the length of the training data, I can see that it contains 250,000 numpy arrays.
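As an aside, the workaround above goes through `pickle._Unpickler`, which is a private name. A minimal sketch of the same load using only the public `pickle.load` call with its `encoding` argument is shown here; the function name `load_pickled_splits` is hypothetical, and the sketch assumes the file is a gzip-compressed pickle written under Python 2.

```python
import gzip
import pickle


def load_pickled_splits(path):
    """Load a gzip-compressed pickle (possibly written by Python 2)."""
    with gzip.open(path, "rb") as f:
        # encoding="latin1" lets Python 3 decode Python 2 byte strings,
        # which numpy arrays pickled under Python 2 need.
        return pickle.load(f, encoding="latin1")
```

With the file from the question, this would be called as `training_data, validation_data, test_data = load_pickled_splits("mnist.pkl.gz")`.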

I need to get the data back into the exact format of the original MNIST data (i.e. ubyte, with training and testing data and labels in separate files) so it can be fed into an external pipeline that I have no control over. The output must therefore match the original format exactly.

I'm really stuck on how to do this. I can see things like this that might help, for example, but I can't see how they apply to this problem. If someone could help me convert the output from these pickled numpy arrays back to the original MNIST format (i.e. ubyte, with training and testing data and labels separate), I'd really appreciate it.

Edit 1: I've just realised something that might make this easier. I actually only need to convert the training data into ubyte format, not the testing data, since I already have the testing data in ubyte format from the original.
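For reference, the original MNIST ubyte files use the IDX layout: a big-endian header (magic number 2051 for images or 2049 for labels, then the item count, and for images the row and column counts), followed by raw unsigned bytes. The sketch below shows one possible way to write that layout from numpy arrays. It is not a confirmed solution: the function names are hypothetical, and it assumes `training_data` is a pair of (images, labels) where each image is a flattened 28x28 array of floats in [0, 1]. The `scale` factor is also an assumption; it is worth checking the maximum pixel value in the data first, since a maximum of 255/256 rather than 1.0 would suggest scaling by 256 instead.

```python
import struct

import numpy as np


def write_idx_images(path, images, scale=255.0):
    """Write images as an IDX3 ubyte file (layout of train-images-idx3-ubyte)."""
    # Assumption: one flattened 28x28 image per entry, floats in [0, 1].
    arr = np.asarray(images, dtype=np.float64).reshape(-1, 28, 28)
    pixels = np.clip(np.rint(arr * scale), 0, 255).astype(np.uint8)
    with open(path, "wb") as f:
        # Header: magic 2051, image count, rows, cols (big-endian uint32).
        f.write(struct.pack(">IIII", 2051, pixels.shape[0], 28, 28))
        f.write(pixels.tobytes())


def write_idx_labels(path, labels):
    """Write labels as an IDX1 ubyte file (layout of train-labels-idx1-ubyte)."""
    arr = np.asarray(labels, dtype=np.uint8).reshape(-1)
    with open(path, "wb") as f:
        # Header: magic 2049, label count (big-endian uint32).
        f.write(struct.pack(">II", 2049, arr.shape[0]))
        f.write(arr.tobytes())
```

Under those assumptions, usage would look like `write_idx_images("train-images-idx3-ubyte", training_data[0])` and `write_idx_labels("train-labels-idx1-ubyte", training_data[1])`. The output files are uncompressed; if the pipeline expects `.gz` files, they would still need to be gzipped afterwards.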

Source: https://stackoverflow.com/questions/65156592/convert-mnist-data-from-numpy-arrays-to-original-ubyte-data
