Can't load 'mnist-original' dataset using sklearn

情到浓时终转凉″ submitted on 2019-12-04 19:56:15

I just hit the same issue, and it took me some time to track down. One possible cause is that the data was corrupted during the first download, in which case the fix is to remove the cached copy. To find the scikit-learn data home directory:

# get_data_home is exported from sklearn.datasets directly
from sklearn.datasets import get_data_home
print(get_data_home())

Empty that directory and download the dataset again. This worked for me. For reference: https://github.com/ageron/handson-ml/issues/143
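
The cleanup step can be sketched roughly as below. This assumes the cached MNIST file sits in an mldata subfolder of the data home, as described elsewhere on this page. The helper name clear_mldata_cache is made up for illustration and is not part of scikit-learn.

```python
import shutil
from pathlib import Path


def clear_mldata_cache(data_home):
    """Delete <data_home>/mldata if it exists.

    Hypothetical helper, not a scikit-learn function. Returns True if a
    folder was removed and False if there was nothing to remove.
    """
    cache_dir = Path(data_home).expanduser() / "mldata"
    if cache_dir.is_dir():
        shutil.rmtree(cache_dir)  # removes the folder and every cached file in it
        return True
    return False
```

To use it, pass in whatever get_data_home() printed, for example clear_mldata_cache(get_data_home()), then run your fetch call again.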

This is also related to the following question: How to use datasets.fetch_mldata() in sklearn?

A quick update for the question here:

mldata.org still seems to be down, so scikit-learn is dropping fetch_mldata (it was deprecated in version 0.20 and removed in 0.22).

Solution for the moment: since running fetch_mldata only creates an empty folder in data_home, download a copy of the data from https://github.com/amplab/datascience-sp14/blob/master/lab7/mldata/mnist-original.mat. Then place it in the empty mldata folder inside your data home (by default ~/scikit_learn_data/mldata/).

It worked for me.
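
If you prefer to script the "place it in the mldata folder" step, here is a minimal sketch. It assumes you have already downloaded mnist-original.mat by hand. The function name install_mnist_mat and its parameters are invented for this example.

```python
import shutil
from pathlib import Path


def install_mnist_mat(downloaded_file, data_home):
    """Copy a manually downloaded .mat file to <data_home>/mldata/mnist-original.mat.

    Hypothetical helper for illustration; returns the destination path.
    """
    target_dir = Path(data_home).expanduser() / "mldata"
    target_dir.mkdir(parents=True, exist_ok=True)  # create the folder if it is missing
    target = target_dir / "mnist-original.mat"
    shutil.copyfile(downloaded_file, target)
    return target
```

Call it with the path of your download and the directory reported by get_data_home().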

Instead of :

from sklearn.datasets.mldata import fetch_mldata

use:

from sklearn.datasets import fetch_mldata

And then:

mnist = fetch_mldata('MNIST original')
X = mnist.data.astype('float64')
y = mnist.target

Please see this example:

For anyone else with the same issue: in my case it was a connection problem. If you get a similar error, check that the mnist-original.mat file downloaded completely, as suggested by @vivek-kumar. The file is currently 55.4 MB.
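
A quick way to run that check from Python is sketched below. It assumes the 55.4 MB figure means decimal megabytes (1 MB = 1,000,000 bytes) and allows a small tolerance. Both the function name looks_complete and the tolerance value are my own choices, so adjust them if your file manager reports sizes differently.

```python
from pathlib import Path

EXPECTED_MB = 55.4  # size quoted in the answer above


def looks_complete(mat_path, expected_mb=EXPECTED_MB, tolerance_mb=0.5):
    """Return True if the file exists and its size is close to expected_mb.

    Hypothetical helper; assumes decimal megabytes (1 MB = 1_000_000 bytes).
    """
    path = Path(mat_path).expanduser()
    if not path.is_file():
        return False  # missing file counts as incomplete
    size_mb = path.stat().st_size / 1_000_000
    return abs(size_mb - expected_mb) <= tolerance_mb
```

If this returns False for your cached mnist-original.mat, the download was probably cut short and the file should be deleted and fetched again.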

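In the latest sklearn version (0.21) you can use load_digits instead. Note that this loads scikit-learn's small built-in digits dataset (1,797 images of 8x8 pixels), not the full 70,000-image MNIST: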

import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_digits

digits = load_digits()

X = digits.data
y = digits.target
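
Since load_digits gives you the small 8x8 digits set rather than the full MNIST, an alternative worth trying is fetch_openml, which scikit-learn has shipped since version 0.20. This is a rough sketch, not something from the answers on this page. The wrapper load_mnist and its fetch parameter are made up, and the parameter is there only so the function can be tried without a network connection.

```python
def load_mnist(fetch=None):
    """Return (X, y) for the 70,000-image MNIST set from OpenML.

    Hypothetical wrapper: `fetch` is an illustrative hook for testing,
    and defaults to sklearn.datasets.fetch_openml.
    """
    if fetch is None:
        # Imported lazily so the module loads even without scikit-learn
        from sklearn.datasets import fetch_openml
        fetch = fetch_openml
    mnist = fetch("mnist_784", version=1)  # downloads on first call, then uses the cache
    # Labels come back as strings such as '5'; convert them yourself if you need integers
    return mnist.data, mnist.target
```

Typical use is X, y = load_mnist(). The first call needs a working internet connection.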

Try this. It works on scikit-learn versions that still include fetch_mldata (it was removed in 0.22):

from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')