
Can't Load 'mnist-original' Dataset Using Sklearn

This question is similar to what was asked here and here. Unfortunately, in my case the suggested solution didn't fix the problem. I need to work with the MNIST dataset but I can't fe

Solution 1:

Unfortunately, fetch_mldata() was deprecated in scikit-learn 0.20 and removed in 0.22. Its replacement is fetch_openml().

So, instead of using:

from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')

You must use:

from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784')
x = mnist.data
y = mnist.target

The shape of x will be (70000, 784) and the shape of y will be (70000,).
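
A slightly more defensive variant of the snippet above, as a sketch only: in recent scikit-learn releases fetch_openml can return pandas objects by default, and the OpenML labels arrive as strings, so this version asks for NumPy arrays and casts the labels to integers. The load_mnist helper and its fetcher parameter are hypothetical names introduced here for illustration (the parameter only makes it possible to try the function without a network connection), and as_frame needs scikit-learn 0.22 or newer.

```python
def load_mnist(fetcher=None):
    """Return (x, y) for MNIST as arrays, with integer labels.

    `fetcher` is a hypothetical hook for illustration: any callable with the
    same call shape as sklearn.datasets.fetch_openml. It defaults to the
    real function.
    """
    if fetcher is None:
        # Imported lazily so the helper can be exercised without scikit-learn.
        from sklearn.datasets import fetch_openml
        fetcher = fetch_openml

    # as_frame=False requests NumPy arrays instead of a pandas DataFrame.
    mnist = fetcher("mnist_784", version=1, as_frame=False)

    x = mnist.data
    # OpenML stores the digit labels as strings ('0' to '9'); cast to integers.
    y = mnist.target.astype("uint8")
    return x, y
```

Calling x, y = load_mnist() downloads the data on first use and caches it under the scikit-learn data home, so later calls should be much faster.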

Solution 2:

A quick update for the question here:

mldata.org still seems to be down, and scikit-learn is going to remove fetch_mldata.

Solution for the moment: since running the lines above creates an empty folder at the data_home location, download a copy of the data from https://github.com/amplab/datascience-sp14/blob/master/lab7/mldata/mnist-original.mat. Then place it in ~/sklearn_data/mldata/, which is empty (if you have not changed data_home, the default location is ~/scikit_learn_data/mldata/).
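
The manual copy step can also be scripted. This is a minimal sketch, assuming the .mat file has already been downloaded by hand; install_mnist_mat is a hypothetical helper written for this answer, and both paths are whatever applies on your machine.

```python
import shutil
from pathlib import Path


def install_mnist_mat(downloaded_file, data_home):
    """Copy a manually downloaded mnist-original.mat into <data_home>/mldata/.

    Hypothetical helper: `downloaded_file` is where the browser saved the
    file, `data_home` is the scikit-learn data directory.
    """
    source = Path(downloaded_file).expanduser()
    if not source.is_file():
        raise FileNotFoundError(f"No such file: {source}")

    # fetch_mldata looks for its cached files in the mldata/ subfolder.
    target_dir = Path(data_home).expanduser() / "mldata"
    target_dir.mkdir(parents=True, exist_ok=True)

    target = target_dir / "mnist-original.mat"
    shutil.copyfile(source, target)
    return target
```

For example, install_mnist_mat("~/Downloads/mnist-original.mat", "~/scikit_learn_data") would put the file where a default installation looks for it; the Downloads path is only an illustration.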

It worked for me.

Solution 3:

I just faced the same issue and it took me some time to find the problem. One possible cause is that the data was corrupted during the first download. If so, remove the cached data. You can find the scikit-learn data home directory as follows:

from sklearn.datasets import get_data_home
print(get_data_home())

Clean the directory and redownload the dataset. This solution works for me. For reference: https://github.com/ageron/handson-ml/issues/143
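
Cleaning the directory can be done by hand, or with a few lines like the following. This is a sketch under assumptions: remove_cached_dataset is a hypothetical helper, data_home is the path printed by get_data_home(), and the subfolder name defaults to mldata, which is where fetch_mldata keeps its files (fetch_openml uses an openml subfolder instead). scikit-learn also ships sklearn.datasets.clear_data_home(), which wipes the whole cache rather than a single subfolder.

```python
import shutil
from pathlib import Path


def remove_cached_dataset(data_home, subfolder="mldata"):
    """Delete one cache subfolder under the scikit-learn data home.

    Returns True if something was removed and False if the subfolder did
    not exist. Hypothetical helper; pass the path from get_data_home().
    """
    cache_dir = Path(data_home).expanduser() / subfolder
    if cache_dir.is_dir():
        # Removes the folder and any possibly corrupted files inside it.
        shutil.rmtree(cache_dir)
        return True
    return False
```

After the folder is gone, fetching the dataset again triggers a fresh download.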

This is also related to the following question: How to use datasets.fetch_mldata() in sklearn?

Solution 4:

Instead of:

from sklearn.datasets.mldata import fetch_mldata

use:

from sklearn.datasets import fetch_mldata

And then:

mnist = fetch_mldata('MNIST original')
X = mnist.data.astype('float64')
y = mnist.target


Solution 5:

For people having the same issue: it was a connection problem. If you get a similar error, check that you have the entire mnist-original.mat file, as suggested by @vivek-kumar. Current file size: 55.4 MB.
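
A quick way to spot a truncated download is to compare the file size with the figure quoted above. The sketch below is an assumption-laden check, not an integrity test: looks_complete is a hypothetical helper, and the 55,000,000-byte threshold is simply chosen to sit a little under the 55.4 MB mentioned in this answer.

```python
from pathlib import Path

# Assumption: anything clearly below the ~55.4 MB quoted above is incomplete.
MIN_EXPECTED_BYTES = 55_000_000


def looks_complete(mat_path, min_bytes=MIN_EXPECTED_BYTES):
    """Return True if the file exists and is at least `min_bytes` large."""
    path = Path(mat_path).expanduser()
    return path.is_file() and path.stat().st_size >= min_bytes
```

If looks_complete(...) returns False for your mnist-original.mat, delete the file and download it again over a stable connection.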
