Can't Load 'mnist-original' Dataset Using Sklearn
Solution 1:
Unfortunately, fetch_mldata() has been replaced by fetch_openml() in recent versions of sklearn.
So, instead of using:
from sklearn.datasets import fetch_mldata
mnist = fetch_mldata('MNIST original')
You must use:
from sklearn.datasets import fetch_openml
mnist = fetch_openml('mnist_784')
x = mnist.data
y = mnist.target
The shape of x will be (70000, 784) and the shape of y will be (70000,).
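A slightly fuller sketch of the same call, assuming a scikit-learn version recent enough to support the as_frame parameter. The names load_mnist and fetch are hypothetical, introduced here for illustration; fetch exists only so the function can be tried without a network connection. OpenML returns the digit labels as strings such as '5', so the sketch casts them to integers.

```python
def load_mnist(fetch=None):
    """Return (X, y) for MNIST, with labels cast to small integers.

    `fetch` is a hypothetical hook: pass a stand-in loader to try the
    function offline. By default the real fetch_openml is used.
    """
    if fetch is None:
        # Imported lazily so this module loads even without scikit-learn.
        from sklearn.datasets import fetch_openml
        fetch = fetch_openml

    # as_frame=False asks for NumPy arrays instead of a pandas DataFrame
    # (assumes a scikit-learn release that supports this parameter).
    mnist = fetch("mnist_784", version=1, as_frame=False)

    X = mnist.data
    # OpenML stores the digit labels as strings, so cast them to integers.
    y = mnist.target.astype("uint8")
    return X, y
```

Calling X, y = load_mnist() downloads the dataset on first use and caches it under data_home, so later calls should be much faster.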
Solution 2:
A quick update for the question here:
mldata.org still seems to be down, and scikit-learn is going to remove fetch_mldata.
Solution for the moment: since running the lines above creates an empty folder at the location of data_home, find a copy of the data here: https://github.com/amplab/datascience-sp14/blob/master/lab7/mldata/mnist-original.mat and download it. Then place it in the mldata folder under data_home (by default ~/scikit_learn_data/mldata/), which is empty.
It worked for me.
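The manual copy step can also be scripted. This is a minimal sketch using only the standard library; install_mnist_mat is a hypothetical helper name, and the default location is an assumption that mirrors scikit-learn's documented lookup (the SCIKIT_LEARN_DATA environment variable, otherwise ~/scikit_learn_data). Pass data_home explicitly if yours differs.

```python
import os
import shutil
from pathlib import Path


def install_mnist_mat(downloaded_file, data_home=None):
    """Copy a manually downloaded mnist-original.mat into <data_home>/mldata."""
    if data_home is None:
        # Assumption: mirror scikit-learn's default lookup, i.e. the
        # SCIKIT_LEARN_DATA environment variable, else ~/scikit_learn_data.
        data_home = os.environ.get("SCIKIT_LEARN_DATA", "~/scikit_learn_data")
    mldata_dir = Path(data_home).expanduser() / "mldata"
    # Create the folder if the failed fetch did not leave one behind.
    mldata_dir.mkdir(parents=True, exist_ok=True)

    target = mldata_dir / "mnist-original.mat"
    shutil.copyfile(downloaded_file, target)
    return target
```

For example, install_mnist_mat("Downloads/mnist-original.mat") would copy the file into the default folder and return the destination path.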
Solution 3:
I just faced the same issue and it took me some time to find the problem. One possible cause is that the data was corrupted during the first download. To fix this, remove the cached data. Find the scikit-learn data home directory as follows:
from sklearn.datasets import get_data_home
print(get_data_home())
Clean the directory and redownload the dataset. This solution works for me. For reference: https://github.com/ageron/handson-ml/issues/143
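Cleaning the cache can be done by hand or with a few lines of Python. The sketch below removes only the mldata subfolder; clear_mldata_cache is a hypothetical helper name, and it expects the path printed by get_data_home(). scikit-learn also provides clear_data_home(), but that deletes everything under the data home, which may be more than you want.

```python
import shutil
from pathlib import Path


def clear_mldata_cache(data_home):
    """Delete <data_home>/mldata so the next fetch starts from scratch.

    Returns True if a cache folder was removed, False if none existed.
    """
    cache_dir = Path(data_home).expanduser() / "mldata"
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True
```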
This is also related to the following question: How to use datasets.fetch_mldata() in sklearn?
Solution 4:
Instead of:
from sklearn.datasets.mldata import fetch_mldata
use:
from sklearn.datasets import fetch_mldata
And then:
mnist = fetch_mldata('MNIST original')
X = mnist.data.astype('float64')
y = mnist.target
Please see this example:
Solution 5:
For people having the same issue: it was a connection problem. If you get a similar error, check that you have the entire mnist-original.mat file, as suggested by @vivek-kumar. Current file size: 55.4 MB.
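A quick way to check for a truncated download is to compare the file size against the figure quoted above. This is a rough sketch: looks_complete and MIN_EXPECTED_BYTES are hypothetical names, and the threshold is an assumption, since the post does not say whether MB means 10**6 or 2**20 bytes.

```python
import os

# Assumption: the post quotes 55.4 MB without saying whether MB means
# 10**6 or 2**20 bytes, so this threshold sits just below both readings.
MIN_EXPECTED_BYTES = 55_000_000


def looks_complete(path, min_bytes=MIN_EXPECTED_BYTES):
    """Return True if the file exists and is at least `min_bytes` long."""
    return os.path.isfile(path) and os.path.getsize(path) >= min_bytes
```

A file that passes this check can still be corrupted, so treat a True result as "not obviously truncated" rather than as proof of a good download.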