sklearn.datasets.load_iris(return_X_y=False) [source] Load and return the iris dataset (classification). The iris dataset is a classic and very easy multi-class classification dataset. Classes 3 Samples per class 50 Samples total 150 Dimensionality 4 Features real, positive Read more in the User Guide. Parameters: return_X_y : boolean, default=False. If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target object. New in version
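As a minimal sketch of the `return_X_y` switch described above (assuming scikit-learn 0.18 or later and NumPy are installed), the loader can be called in both modes and the results checked against the summary figures quoted in the entry. The variable names are illustrative only.

```python
import numpy as np
from sklearn.datasets import load_iris

# Default mode: a Bunch object exposing the data and target arrays.
iris = load_iris()

# Tuple mode: (data, target) directly, as described for return_X_y=True.
X, y = load_iris(return_X_y=True)

print(X.shape)         # samples x features
print(np.bincount(y))  # number of samples in each class
```

The tuple form is convenient when the arrays are passed straight to an estimator and the extra metadata on the Bunch is not needed.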

sklearn.datasets.load_files(container_path, description=None, categories=None, load_content=True, shuffle=True, encoding=None, decode_error='strict', random_state=0) [source] Load text files with categories as subfolder names. Individual samples are assumed to be files stored in a two-level folder structure such as the following: container_folder/ category_1_folder/ file_1.txt file_2.txt ... file_42.txt category_2_folder/ file_43.txt file_44.txt ... The folder names are used as supervised
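The folder layout described above can be tried out with a throwaway directory. The sketch below is illustrative only: the category names `ham` and `spam`, the file names, and the file contents are invented for the example, and it assumes Python 3.6+ with scikit-learn installed.

```python
import os
import tempfile

from sklearn.datasets import load_files

# Hypothetical toy corpus: two category folders with a few text files each.
samples = {"ham": ["hello there", "see you soon"], "spam": ["buy now"]}

with tempfile.TemporaryDirectory() as container:
    for category, texts in samples.items():
        folder = os.path.join(container, category)
        os.makedirs(folder)
        for i, text in enumerate(texts):
            path = os.path.join(folder, f"file_{i}.txt")
            with open(path, "w", encoding="utf-8") as fh:
                fh.write(text)

    # encoding="utf-8" asks for decoded text; shuffle=False keeps
    # the file order deterministic. Content is read during this call.
    bunch = load_files(container, encoding="utf-8", shuffle=False)

print(bunch.target_names)  # category names taken from the subfolder names
print(len(bunch.data))     # one entry per file
```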

sklearn.datasets.load_digits()

sklearn.datasets.load_digits(n_class=10, return_X_y=False) [source] Load and return the digits dataset (classification). Each datapoint is an 8x8 image of a digit. Classes 10 Samples per class ~180 Samples total 1797 Dimensionality 64 Features integers 0-16 Read more in the User Guide. Parameters: n_class : integer, between 0 and 10, optional (default=10) The number of classes to return. return_X_y : boolean, default=False. If True, returns (data, target) instead of a Bunch object. See b
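A short sketch of how `n_class` and `return_X_y` interact, assuming scikit-learn 0.18 or later. The choice of three classes is arbitrary and only for illustration.

```python
from sklearn.datasets import load_digits

# Full dataset as a Bunch; the flattened 8x8 images live in .data.
digits = load_digits()

# Restrict to the first three classes (digits 0, 1, 2) and
# request plain (data, target) arrays instead of a Bunch.
X, y = load_digits(n_class=3, return_X_y=True)

print(digits.data.shape)      # all samples, 64 features each
print(sorted(set(y.tolist())))  # labels kept by n_class=3
```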

sklearn.datasets.load_diabetes()

sklearn.datasets.load_diabetes(return_X_y=False) [source] Load and return the diabetes dataset (regression). Samples total 442 Dimensionality 10 Features real, -.2 < x < .2 Targets integer 25 - 346 Read more in the User Guide. Parameters: return_X_y : boolean, default=False. If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target object. New in version 0.18. Returns: data : Bunch Dictionary-like object, the interestin
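To relate the summary figures above to actual arrays, the following sketch (assuming scikit-learn 0.18 or later) loads the regression data in tuple form and inspects the value ranges. It is an illustration, not part of the reference text.

```python
from sklearn.datasets import load_diabetes

# (data, target) arrays for a regression task.
X, y = load_diabetes(return_X_y=True)

print(X.shape, y.shape)   # samples x features, and one target per sample
print(X.min(), X.max())   # feature values are scaled to a narrow range
print(y.min(), y.max())   # range of the regression target
```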

sklearn.datasets.load_breast_cancer()

sklearn.datasets.load_breast_cancer(return_X_y=False) [source] Load and return the breast cancer wisconsin dataset (classification). The breast cancer dataset is a classic and very easy binary classification dataset. Classes 2 Samples per class 212(M),357(B) Samples total 569 Dimensionality 30 Features real, positive Parameters: return_X_y : boolean, default=False If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target object. N
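The class balance quoted above can be checked directly. This sketch assumes scikit-learn 0.18 or later and NumPy; it relies on the Bunch attribute `target_names`, which maps the integer labels to the malignant/benign names.

```python
import numpy as np
from sklearn.datasets import load_breast_cancer

cancer = load_breast_cancer()

# Count samples per integer label and pair each count with its class name.
counts = np.bincount(cancer.target)
per_class = dict(zip(cancer.target_names.tolist(), counts.tolist()))

print(cancer.data.shape)  # samples x features
print(per_class)          # samples per class, keyed by class name
```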

sklearn.datasets.load_boston()

sklearn.datasets.load_boston(return_X_y=False) [source] Load and return the boston house-prices dataset (regression). Samples total 506 Dimensionality 13 Features real, positive Targets real 5. - 50. Parameters: return_X_y : boolean, default=False. If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target object. New in version 0.18. Returns: data : Bunch Dictionary-like object, the interesting attributes are: 'data', the dat

sklearn.datasets.get_data_home()

sklearn.datasets.get_data_home(data_home=None) [source] Return the path of the scikit-learn data dir. This folder is used by some large dataset loaders to avoid downloading the data several times. By default the data dir is set to a folder named 'scikit_learn_data' in the user home folder. Alternatively, it can be set by the 'SCIKIT_LEARN_DATA' environment variable or programmatically by giving an explicit folder path. The '~' symbol is expanded to the user home folder. If the folder does n
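A small sketch of the "explicit folder path" option, using a temporary directory so that the real home folder is left untouched. The subfolder name `my_sklearn_cache` is made up for the example. It also relies on the folder being created on demand, which is the behaviour in the scikit-learn releases this sketch was written against and should be treated as an assumption for other versions.

```python
import os
import tempfile

from sklearn.datasets import get_data_home

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical cache location that does not exist yet.
    requested = os.path.join(tmp, "my_sklearn_cache")
    existed_before = os.path.isdir(requested)

    # Passing data_home overrides the default location.
    resolved = get_data_home(data_home=requested)
    exists_after = os.path.isdir(resolved)

print(existed_before, exists_after)
```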

sklearn.datasets.fetch_species_distributions()

sklearn.datasets.fetch_species_distributions(data_home=None, download_if_missing=True) [source] Loader for species distribution dataset from Phillips et al. (2006) Read more in the User Guide. Parameters: data_home : optional, default: None Specify another download and cache folder for the datasets. By default all scikit learn data is stored in '~/scikit_learn_data' subfolders. download_if_missing : optional, True by default If False, raise an IOError if the data is not locally availabl
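The `download_if_missing=False` behaviour can be exercised without any network access by pointing `data_home` at an empty temporary folder. This is a sketch under the assumption that nothing is cached there. On Python 3, `IOError` is an alias of `OSError`, so the handler below covers both spellings.

```python
import tempfile

from sklearn.datasets import fetch_species_distributions

with tempfile.TemporaryDirectory() as empty_cache:
    try:
        # Nothing is cached in empty_cache, and downloading is disabled.
        fetch_species_distributions(
            data_home=empty_cache, download_if_missing=False
        )
        outcome = "loaded"
    except IOError:
        outcome = "missing"

print(outcome)
```

The same pattern should apply to the other `fetch_*` loaders that accept `download_if_missing`, which is useful for scripts that must fail fast when run offline.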

sklearn.datasets.fetch_rcv1()

sklearn.datasets.fetch_rcv1(data_home=None, subset='all', download_if_missing=True, random_state=None, shuffle=False) [source] Load the RCV1 multilabel dataset, downloading it if necessary. Version: RCV1-v2, vectors, full sets, topics multilabels. Classes 103 Samples total 804414 Dimensionality 47236 Features real, between 0 and 1 Read more in the User Guide. New in version 0.17. Parameters: data_home : string, optional Specify another download and cache folder for the datasets. By defa

sklearn.datasets.fetch_olivetti_faces()

sklearn.datasets.fetch_olivetti_faces(data_home=None, shuffle=False, random_state=0, download_if_missing=True) [source] Loader for the Olivetti faces data-set from AT&T. Read more in the User Guide. Parameters: data_home : optional, default: None Specify another download and cache folder for the datasets. By default all scikit learn data is stored in '~/scikit_learn_data' subfolders. shuffle : boolean, optional If True the order of the dataset is shuffled to avoid having images of t