covariance.MinCovDet()

class sklearn.covariance.MinCovDet(store_precision=True, assume_centered=False, support_fraction=None, random_state=None) [source] Minimum Covariance Determinant (MCD): robust estimator of covariance. The Minimum Covariance Determinant covariance estimator is to be applied on Gaussian-distributed data, but could still be relevant on data drawn from a unimodal, symmetric distribution. It is not meant to be used with multi-modal data (the algorithm used to fit a MinCovDet object is likely to fail in such a case).
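
A minimal sketch of fitting MinCovDet on contaminated Gaussian data; the dataset below is illustrative, not from the original text.

import numpy as np
from sklearn.covariance import MinCovDet

rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=[0, 0], cov=[[1, 0.5], [0.5, 1]], size=500)
X[:25] += 7  # inject a few outliers

mcd = MinCovDet(random_state=0).fit(X)
print(mcd.location_)       # robust location estimate
print(mcd.covariance_)     # robust covariance estimate
print(mcd.support_.sum())  # number of samples kept for the raw MCD estimate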

sklearn.random_projection.johnson_lindenstrauss_min_dim()

sklearn.random_projection.johnson_lindenstrauss_min_dim(n_samples, eps=0.1) [source] Find a "safe" number of components to randomly project to. The distortion introduced by a random projection p only changes the distance between two points by a factor (1 ± eps) in a Euclidean space with good probability. The projection p is an eps-embedding as defined by: (1 - eps) ||u - v||^2 < ||p(u) - p(v)||^2 < (1 + eps) ||u - v||^2 where u and v are any rows taken from a dataset of shape [n_samples, n_features].
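
A quick sketch of querying the bound; the sample counts and tolerances below are arbitrary examples. Tighter eps demands far more components.

from sklearn.random_projection import johnson_lindenstrauss_min_dim

print(johnson_lindenstrauss_min_dim(n_samples=1000, eps=0.1))   # strict tolerance
print(johnson_lindenstrauss_min_dim(n_samples=1000, eps=0.5))   # loose tolerance
print(johnson_lindenstrauss_min_dim(n_samples=[1000, 10000], eps=0.1))  # array input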

manifold.MDS()

class sklearn.manifold.MDS(n_components=2, metric=True, n_init=4, max_iter=300, verbose=0, eps=0.001, n_jobs=1, random_state=None, dissimilarity='euclidean') [source] Multidimensional scaling. Read more in the User Guide.

Parameters:
metric : boolean, optional, default: True
    Compute the metric or nonmetric SMACOF (Scaling by Majorizing a Complicated Function) algorithm.
n_components : int, optional, default: 2
    Number of dimensions in which to immerse the dissimilarities; overridden if an initial array is provided.
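
A minimal sketch of embedding a small dataset into 2-D with metric MDS; the digits dataset is simply an illustrative choice.

from sklearn.datasets import load_digits
from sklearn.manifold import MDS

X, _ = load_digits(return_X_y=True)
embedding = MDS(n_components=2, random_state=0).fit_transform(X[:100])
print(embedding.shape)  # (100, 2)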

sklearn.datasets.load_breast_cancer()

sklearn.datasets.load_breast_cancer(return_X_y=False) [source] Load and return the breast cancer Wisconsin dataset (classification). The breast cancer dataset is a classic and very easy binary classification dataset.

Classes: 2
Samples per class: 212 (malignant), 357 (benign)
Samples total: 569
Dimensionality: 30
Features: real, positive

Parameters:
return_X_y : boolean, default=False
    If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target object.
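
A short sketch of loading the dataset both ways; expected shapes are noted in the comments.

from sklearn.datasets import load_breast_cancer

bunch = load_breast_cancer()
print(bunch.data.shape, bunch.target.shape)  # (569, 30) (569,)

X, y = load_breast_cancer(return_X_y=True)
print(X.shape, y.shape)  # (569, 30) (569,)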

Label Propagation digits active learning

Demonstrates an active learning technique to learn handwritten digits using label propagation. We start by training a label propagation model with only 10 labeled points, then we select the top five most uncertain points to label. Next, we train with 15 labeled points (the original 10 plus 5 new ones). We repeat this process four times to obtain a model trained with 30 labeled examples. A plot will appear showing the top 5 most uncertain digits for each iteration of training. These may or may not contain mistakes, but we will train the next model with their true labels.
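
A condensed sketch of the loop described above, assuming LabelSpreading as the propagation model and entropy of the label distributions as the uncertainty score; the gamma and max_iter settings are illustrative.

import numpy as np
from scipy import stats
from sklearn.datasets import load_digits
from sklearn.semi_supervised import LabelSpreading

digits = load_digits()
X, y = digits.data[:330], digits.target[:330]

y_train = y.copy()
y_train[10:] = -1  # -1 marks unlabeled points; only the first 10 are labeled

for iteration in range(5):  # fits see 10, 15, 20, 25, then 30 labels
    model = LabelSpreading(gamma=0.25, max_iter=20)
    model.fit(X, y_train)
    # Entropy of the predicted label distributions measures uncertainty.
    entropies = stats.entropy(model.label_distributions_.T)
    # Take the five most uncertain, still-unlabeled points...
    ranked = np.argsort(entropies)[::-1]
    uncertain = [i for i in ranked if y_train[i] == -1][:5]
    # ...and reveal their true labels for the next round.
    y_train[uncertain] = y[uncertain]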

exceptions.DataConversionWarning

class sklearn.exceptions.DataConversionWarning [source] Warning used to notify implicit data conversions happening in the code. This warning occurs when some input data needs to be converted or interpreted in a way that may not match the user's expectations. For example, this warning may occur when the user:
- passes an integer array to a function which expects float input and will convert the input;
- requests a non-copying operation, but a copy is required to meet the implementation's data-type requirements;
- passes an input whose shape can be interpreted ambiguously.
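
A small sketch that provokes and catches the warning; passing a column-vector y where a 1-D array is expected is one known trigger (the estimator ravels it and warns).

import warnings
import numpy as np
from sklearn.exceptions import DataConversionWarning
from sklearn.ensemble import RandomForestClassifier

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([[0], [0], [1], [1]])  # shape (4, 1) instead of (4,)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    RandomForestClassifier(n_estimators=10).fit(X, y)

# True if the implicit y conversion was reported
print(any(issubclass(w.category, DataConversionWarning) for w in caught))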

Plot the decision boundaries of a VotingClassifier

Plot the decision boundaries of a VotingClassifier for two features of the Iris dataset. Plot the class probabilities of the first sample in a toy dataset predicted by three different classifiers and averaged by the VotingClassifier. First, three exemplary classifiers are initialized (DecisionTreeClassifier, KNeighborsClassifier, and SVC) and used to initialize a soft-voting VotingClassifier with weights [2, 1, 2], which means that the predicted probabilities of the DecisionTreeClassifier and SVC each count twice as much as those of the KNeighborsClassifier when the averaged probability is calculated.
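
A condensed sketch of the soft-voting setup described above, fit on two Iris features; the hyperparameters are illustrative and the plotting is omitted.

from sklearn import datasets
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier

iris = datasets.load_iris()
X, y = iris.data[:, [0, 2]], iris.target  # two of the four Iris features

clf1 = DecisionTreeClassifier(max_depth=4)
clf2 = KNeighborsClassifier(n_neighbors=7)
clf3 = SVC(kernel='rbf', gamma=0.1, probability=True)  # probability=True enables soft voting
eclf = VotingClassifier(
    estimators=[('dt', clf1), ('knn', clf2), ('svc', clf3)],
    voting='soft', weights=[2, 1, 2])
eclf.fit(X, y)
print(eclf.predict_proba(X[:1]))  # weighted-average class probabilities for one sample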

SVM-Kernels

Three different types of SVM-Kernels are displayed below. The polynomial and RBF kernels are especially useful when the data points are not linearly separable.

print(__doc__)

# Code source: Gaël Varoquaux
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

# Our dataset and targets
X = np.c_[(.4, -.7),
          (-1.5, -1),
          (-1.4, -.9),
          (-1.3, -1.2),
          (-1.1, -.2),
          (-1.2, -.4),
          (-.5, 1.2),
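
Since the excerpt above is cut off, here is a self-contained sketch of the same idea: fit SVC with three kernels on toy data and draw each decision boundary. The blob data and kernel settings are illustrative assumptions, not the original example's values.

import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm

rng = np.random.RandomState(0)
X = np.r_[rng.randn(20, 2) - [2, 2], rng.randn(20, 2) + [2, 2]]
Y = [0] * 20 + [1] * 20

for kernel in ('linear', 'poly', 'rbf'):
    clf = svm.SVC(kernel=kernel, gamma=2)
    clf.fit(X, Y)

    plt.figure()
    plt.title(kernel)
    plt.scatter(X[:, 0], X[:, 1], c=Y, zorder=10, cmap=plt.cm.Paired)
    # Evaluate the decision function on a grid and draw boundary and margins.
    xx, yy = np.meshgrid(np.linspace(-5, 5, 200), np.linspace(-5, 5, 200))
    Z = clf.decision_function(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
    plt.contour(xx, yy, Z, levels=[-1, 0, 1],
                linestyles=['--', '-', '--'], colors='k')
plt.show()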

sklearn.datasets.fetch_lfw_pairs()

sklearn.datasets.fetch_lfw_pairs(subset='train', data_home=None, funneled=True, resize=0.5, color=False, slice_=(slice(70, 195, None), slice(78, 172, None)), download_if_missing=True) [source] Loader for the Labeled Faces in the Wild (LFW) pairs dataset. This dataset is a collection of JPEG pictures of famous people collected on the internet; all details are available on the official website: http://vis-www.cs.umass.edu/lfw/ Each picture is centered on a single face. Each pixel of each channel (color in RGB) is encoded by a float in range 0.0 - 1.0.
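
A brief sketch of loading the train-split pairs; note the first call downloads the archive (a few hundred MB) to data_home unless it is already cached.

from sklearn.datasets import fetch_lfw_pairs

lfw_pairs = fetch_lfw_pairs(subset='train')
print(lfw_pairs.pairs.shape)   # (n_pairs, 2, height, width) grayscale image pairs
print(lfw_pairs.target.shape)  # per-pair label: same person vs. different persons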

random_projection.GaussianRandomProjection()

class sklearn.random_projection.GaussianRandomProjection(n_components='auto', eps=0.1, random_state=None) [source] Reduce dimensionality through Gaussian random projection. The components of the random matrix are drawn from N(0, 1 / n_components). Read more in the User Guide.

Parameters:
n_components : int or 'auto', optional (default = 'auto')
    Dimensionality of the target projection space. n_components can be automatically adjusted according to the number of samples in the dataset and the bound given by the Johnson-Lindenstrauss lemma. In that case the quality of the embedding is controlled by the eps parameter.
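
A minimal sketch of projecting high-dimensional random data; with the default n_components='auto', the target dimension is derived from n_samples and eps via the Johnson-Lindenstrauss bound.

import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(0)
X = rng.rand(100, 10000)

transformer = GaussianRandomProjection(eps=0.5, random_state=0)
X_new = transformer.fit_transform(X)
print(X_new.shape)  # second axis chosen automatically from n_samples and eps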