sklearn.metrics.f1_score()

sklearn.metrics.f1_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None) [source]
Compute the F1 score, also known as the balanced F-score or F-measure. The F1 score can be interpreted as a weighted average of precision and recall, where an F1 score reaches its best value at 1 and its worst at 0. The relative contribution of precision and recall to the F1 score is equal. The formula for the F1 score is:
F1 = 2 * (precision * recall) / (precision + recall)
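
An illustrative sketch of the default binary case (the label vectors below are made up, not part of this entry):

import numpy as np
from sklearn.metrics import f1_score

y_true = np.array([0, 1, 1, 0, 1, 1])
y_pred = np.array([0, 1, 0, 0, 1, 1])
# precision = 3/3 = 1.0, recall = 3/4 = 0.75
# F1 = 2 * (1.0 * 0.75) / (1.0 + 0.75) = 0.857...
print(f1_score(y_true, y_pred))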

Pipeline Anova SVM

Simple usage of Pipeline that successively runs a univariate feature selection with ANOVA and then a C-SVM on the selected features.

print(__doc__)
from sklearn import svm
from sklearn.datasets import samples_generator
from sklearn.feature_selection import SelectKBest, f_regression
from sklearn.pipeline import make_pipeline

# import some data to play with
X, y = samples_generator.make_classification(
    n_features=20, n_informative=3, n_redundant=0, n_classes=4,
    n_clusters_per_class=2)
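
The snippet is cut off after the data generation; a minimal continuation consistent with the description above (the choice of k=3 and the linear kernel are assumptions, not confirmed by this excerpt):

# univariate feature selection followed by a C-SVM, chained in a pipeline
anova_filter = SelectKBest(f_regression, k=3)
clf = svm.SVC(kernel='linear')
anova_svm = make_pipeline(anova_filter, clf)
anova_svm.fit(X, y)
print(anova_svm.predict(X))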

sklearn.metrics.consensus_score()

sklearn.metrics.consensus_score(a, b, similarity='jaccard') [source]
The similarity of two sets of biclusters. Similarity between individual biclusters is computed first. Then the best matching between the two sets is found using the Hungarian algorithm. The final score is the sum of similarities divided by the size of the larger set. Read more in the User Guide.
Parameters:
a : (rows, columns) Tuple of row and column indicators for a set of biclusters.
b : (rows, columns) Another set of biclusters like a.
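
An illustrative sketch (the indicator arrays below are made up for the example):

import numpy as np
from sklearn.metrics import consensus_score

# two biclusters on a 4x4 matrix, given as boolean row/column indicators
# of shape (n_biclusters, n_rows) and (n_biclusters, n_columns)
rows = np.array([[True, True, False, False],
                 [False, False, True, True]])
cols = np.array([[True, True, False, False],
                 [False, False, True, True]])
# identical sets of biclusters give a perfect Jaccard consensus of 1.0
print(consensus_score((rows, cols), (rows, cols)))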

sklearn.metrics.pairwise.manhattan_distances()

sklearn.metrics.pairwise.manhattan_distances(X, Y=None, sum_over_features=True, size_threshold=500000000.0) [source]
Compute the L1 distances between the vectors in X and Y. With sum_over_features equal to False it returns the componentwise distances. Read more in the User Guide.
Parameters:
X : array_like An array with shape (n_samples_X, n_features).
Y : array_like, optional An array with shape (n_samples_Y, n_features).
sum_over_features : bool, default=True If True the function returns the pairwise distance matrix, else it returns the componentwise L1 pairwise-distances.
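
An illustrative sketch (the arrays are made up):

import numpy as np
from sklearn.metrics.pairwise import manhattan_distances

X = np.array([[1, 2], [3, 4]])
Y = np.array([[1, 0]])
# summed per pair: |1-1|+|2-0| = 2 and |3-1|+|4-0| = 6
print(manhattan_distances(X, Y))
# componentwise |x - y| instead of the sum, one row per (x, y) pair
print(manhattan_distances(X, Y, sum_over_features=False))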

sklearn.cross_validation.check_cv()

Warning: DEPRECATED
sklearn.cross_validation.check_cv(cv, X=None, y=None, classifier=False) [source]
Input checker utility for building a CV in a user-friendly way.
Deprecated since version 0.18: This module will be removed in 0.20. Use sklearn.model_selection.check_cv instead.
Parameters:
cv : int, cross-validation generator or an iterable, optional. Determines the cross-validation splitting strategy. Possible inputs for cv are: None, to use the default 3-fold cross-validation; an integer, to specify the number of folds; an object to be used as a cross-validation generator; or an iterable yielding train/test splits.
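
Since this variant is deprecated, here is a sketch against the recommended replacement in sklearn.model_selection (the toy labels are made up):

import numpy as np
from sklearn.model_selection import check_cv

y = np.array([0, 0, 1, 1, 1, 0])
# classifier=True with an integer cv yields a stratified K-fold splitter
cv = check_cv(cv=3, y=y, classifier=True)
for train_idx, test_idx in cv.split(np.zeros((6, 1)), y):
    print(train_idx, test_idx)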

exceptions.NotFittedError

class sklearn.exceptions.NotFittedError [source]
Exception class to raise if estimator is used before fitting. This class inherits from both ValueError and AttributeError to help with exception handling and backward compatibility.
Examples
>>> from sklearn.svm import LinearSVC
>>> from sklearn.exceptions import NotFittedError
>>> try:
...     LinearSVC().predict([[1, 2], [2, 3], [3, 4]])
... except NotFittedError as e:
...     print(repr(e))
...
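
The error disappears once fit has been called first; a minimal sketch of that pattern (the toy data is made up):

from sklearn.svm import LinearSVC

clf = LinearSVC()
clf.fit([[1, 2], [2, 3], [3, 4]], [0, 0, 1])  # fit before predicting
print(clf.predict([[1, 2]]))                  # no NotFittedError now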

sklearn.metrics.pairwise.paired_manhattan_distances()

sklearn.metrics.pairwise.paired_manhattan_distances(X, Y) [source]
Compute the L1 distances between the vectors in X and Y. Read more in the User Guide.
Parameters:
X : array-like, shape (n_samples, n_features)
Y : array-like, shape (n_samples, n_features)
Returns:
distances : ndarray (n_samples,)
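
An illustrative sketch (arrays made up); note the pairing is row-by-row, not all pairs:

import numpy as np
from sklearn.metrics.pairwise import paired_manhattan_distances

X = np.array([[1, 2], [3, 4]])
Y = np.array([[1, 0], [0, 0]])
# |1-1|+|2-0| = 2 and |3-0|+|4-0| = 7
print(paired_manhattan_distances(X, Y))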

gaussian_process.kernels.Product()

class sklearn.gaussian_process.kernels.Product(k1, k2) [source]
Product-kernel k1 * k2 of two kernels k1 and k2. The resulting kernel is defined as k_prod(X, Y) = k1(X, Y) * k2(X, Y). New in version 0.18.
Parameters:
k1 : Kernel object. The first base-kernel of the product-kernel.
k2 : Kernel object. The second base-kernel of the product-kernel.
Methods
clone_with_theta(theta) Returns a clone of self with given hyperparameters theta.
diag(X) Returns the diagonal of the kernel k(X, X).
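
An illustrative sketch (the choice of component kernels and inputs is an assumption):

import numpy as np
from sklearn.gaussian_process.kernels import RBF, ConstantKernel, Product

k = Product(ConstantKernel(constant_value=2.0), RBF(length_scale=1.0))
# equivalent to writing ConstantKernel(2.0) * RBF(1.0)
X = np.array([[0.0], [1.0]])
print(k(X))       # full kernel matrix k1(X, X) * k2(X, X)
print(k.diag(X))  # diagonal only, here [2. 2.]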

sklearn.metrics.pairwise.paired_distances()

sklearn.metrics.pairwise.paired_distances(X, Y, metric='euclidean', **kwds) [source]
Computes the paired distances between X and Y, i.e. the distances between (X[0], Y[0]), (X[1], Y[1]), etc. Read more in the User Guide.
Parameters:
X : ndarray (n_samples, n_features). Array 1 for distance computation.
Y : ndarray (n_samples, n_features). Array 2 for distance computation.
metric : string or callable. The metric to use when calculating distance between instances in a feature array.
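
An illustrative sketch (arrays made up):

import numpy as np
from sklearn.metrics.pairwise import paired_distances

X = np.array([[0, 0], [1, 1]])
Y = np.array([[1, 1], [3, 4]])
print(paired_distances(X, Y))                      # euclidean: [1.414... 3.605...]
print(paired_distances(X, Y, metric='manhattan'))  # L1: [2. 5.]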

SVM: Maximum margin separating hyperplane

Plot the maximum margin separating hyperplane within a two-class separable dataset using a Support Vector Machine classifier with a linear kernel.

print(__doc__)
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

# we create 40 separable points
np.random.seed(0)
X = np.r_[np.random.randn(20, 2) - [2, 2], np.random.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20

# fit the model
clf = svm.SVC(kernel='linear')
clf.fit(X, Y)

# get the separating hyperplane
w = clf.coef_[0]
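
The snippet breaks off here; the example continues by turning w and the intercept into a line for plotting. A minimal continuation under the variables above (the plotting range is an assumption):

# the separating line satisfies w[0]*x + w[1]*y + intercept = 0,
# i.e. y = a*x + b with the slope a and offset computed below
a = -w[0] / w[1]
xx = np.linspace(-5, 5)
yy = a * xx - (clf.intercept_[0]) / w[1]

plt.plot(xx, yy, 'k-')
plt.scatter(X[:, 0], X[:, 1], c=Y)
plt.show()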