kernel_approximation.RBFSampler()

class sklearn.kernel_approximation.RBFSampler(gamma=1.0, n_components=100, random_state=None) Approximates the feature map of an RBF kernel by Monte Carlo approximation of its Fourier transform. It implements a variant of Random Kitchen Sinks.[1] Read more in the User Guide. Parameters: gamma : float Parameter of RBF kernel: exp(-gamma * x^2) n_components : int Number of Monte Carlo samples per original feature. Equals the dimensionality of the computed feature space. random_state
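The Monte Carlo idea described above can be sketched in pure Python. This is an illustrative sketch of random Fourier features, not scikit-learn's implementation: the helper names random_fourier_features and rbf_kernel are hypothetical, and the sketch assumes the kernel exp(-gamma * ||x - y||^2), weights drawn from a normal distribution with variance 2 * gamma, and offsets drawn uniformly from [0, 2*pi).

```python
import math
import random


def random_fourier_features(X, gamma=1.0, n_components=100, seed=0):
    """Map each row of X to n_components random Fourier features (sketch)."""
    rng = random.Random(seed)
    n_features = len(X[0])
    scale = math.sqrt(2.0 * gamma)  # std of the random projection weights
    weights = [[rng.gauss(0.0, scale) for _ in range(n_features)]
               for _ in range(n_components)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_components)]
    norm = math.sqrt(2.0 / n_components)
    return [
        [norm * math.cos(sum(w_i * x_i for w_i, x_i in zip(w, x)) + b)
         for w, b in zip(weights, offsets)]
        for x in X
    ]


def rbf_kernel(x, y, gamma=1.0):
    """Exact RBF kernel value, used only to check the approximation."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


x, y = [0.2, -0.1], [0.5, 0.3]
zx, zy = random_fourier_features([x, y], gamma=1.0, n_components=5000, seed=0)
approx = sum(a * b for a, b in zip(zx, zy))  # inner product in feature space
exact = rbf_kernel(x, y, gamma=1.0)
print(approx, exact)
```

The inner product of the mapped vectors approaches the exact kernel value as n_components grows, which is why a linear model on the mapped features can stand in for a kernel method.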

Precision-Recall

Example of the Precision-Recall metric for evaluating classifier output quality. In information retrieval, precision is a measure of result relevancy, while recall is a measure of how many truly relevant results are returned. A high area under the curve represents both high recall and high precision, where high precision relates to a low false positive rate, and high recall relates to a low false negative rate. High scores for both show that the classifier is returning accurate results (high precisio
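As a small worked example of the two quantities, the sketch below counts true positives, false positives, and false negatives for binary labels and derives precision and recall from them. The function name precision_recall and the sample labels are hypothetical; returning 0.0 when a denominator is zero is an assumption of this sketch.

```python
def precision_recall(y_true, y_pred):
    """Return (precision, recall) for binary labels where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    # Assumption of this sketch: report 0.0 when nothing was predicted/present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0]
precision, recall = precision_recall(y_true, y_pred)
print(precision, recall)
```

Here two of the three predicted positives are correct, and two of the three actual positives are found, so both values are 2/3.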

cross_validation.KFold()

Warning: DEPRECATED. class sklearn.cross_validation.KFold(n, n_folds=3, shuffle=False, random_state=None) K-Folds cross validation iterator. Deprecated since version 0.18: This module will be removed in 0.20. Use sklearn.model_selection.KFold instead. Provides train/test indices to split data into train/test sets. Split dataset into k consecutive folds (without shuffling by default). Each fold is then used as a validation set once while the k - 1 remaining fold(s) form the training
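The consecutive, unshuffled splitting described above can be sketched without the library. The generator kfold_indices is a hypothetical helper, not the scikit-learn class; giving the first n % n_folds folds one extra sample is an assumption of this sketch.

```python
def kfold_indices(n, n_folds=3):
    """Yield (train, test) index lists for n samples in consecutive folds."""
    indices = list(range(n))
    base, extra = divmod(n, n_folds)
    start = 0
    for fold in range(n_folds):
        # Assumption: the first `extra` folds absorb the leftover samples.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


splits = list(kfold_indices(7, n_folds=3))
for train, test in splits:
    print("train:", train, "test:", test)
```

Each sample index appears in exactly one test fold, and the remaining folds form the matching training set.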

feature_selection.SelectKBest()

class sklearn.feature_selection.SelectKBest(score_func=f_classif, k=10) Select features according to the k highest scores. Read more in the User Guide. Parameters: score_func : callable Function taking two arrays X and y, and returning a pair of arrays (scores, pvalues) or a single array with scores. Default is f_classif (see below "See also"). The default function only works with classification tasks. k : int or "all", optional, default=10 Number of top features to select. The "all" op
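The selection step itself is simple once scores exist. The sketch below assumes the per-feature scores have already been computed by some scoring function and only shows how the k highest-scoring columns are kept; select_k_best and the sample data are hypothetical, and keeping the surviving columns in their original order is an assumption of this sketch.

```python
def select_k_best(X, scores, k=10):
    """Keep the k columns of X with the highest scores (sketch)."""
    if k == "all":
        keep = list(range(len(scores)))
    else:
        ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
        keep = sorted(ranked[:k])  # restore original column order
    X_new = [[row[j] for j in keep] for row in X]
    return X_new, keep


X = [[1, 2, 3, 4],
     [5, 6, 7, 8]]
scores = [0.1, 5.0, 2.0, 0.3]  # hypothetical precomputed feature scores
X_new, kept = select_k_best(X, scores, k=2)
print(kept, X_new)
```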

neural_network.BernoulliRBM()

class sklearn.neural_network.BernoulliRBM(n_components=256, learning_rate=0.1, batch_size=10, n_iter=10, verbose=0, random_state=None) Bernoulli Restricted Boltzmann Machine (RBM). A Restricted Boltzmann Machine with binary visible units and binary hidden units. Parameters are estimated using Stochastic Maximum Likelihood (SML), also known as Persistent Contrastive Divergence (PCD) [2]. The time complexity of this implementation is O(d ** 2) assuming d ~ n_features ~ n_components.

preprocessing.PolynomialFeatures()

class sklearn.preprocessing.PolynomialFeatures(degree=2, interaction_only=False, include_bias=True) Generate polynomial and interaction features. Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2]. Parameters: degree : integer The degree of the polynomial
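The [a, b] example above can be reproduced with a few lines of standard-library Python. This is a sketch of the expansion for a single sample, not the library's transformer; polynomial_features is a hypothetical helper, the interaction_only option is omitted, and math.prod requires Python 3.8 or later.

```python
from itertools import combinations_with_replacement
from math import prod


def polynomial_features(sample, degree=2, include_bias=True):
    """Return all monomials of the sample's entries up to the given degree."""
    features = []
    first_degree = 0 if include_bias else 1
    for d in range(first_degree, degree + 1):
        # Each combination of d indices (with repetition) is one monomial;
        # the empty combination (d = 0) yields the bias term 1.
        for combo in combinations_with_replacement(range(len(sample)), d):
            features.append(prod(sample[i] for i in combo))
    return features


a, b = 2, 3
expanded = polynomial_features([a, b], degree=2)
print(expanded)  # [1, a, b, a^2, ab, b^2] evaluated at a=2, b=3
```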

gaussian_process.kernels.WhiteKernel()

class sklearn.gaussian_process.kernels.WhiteKernel(noise_level=1.0, noise_level_bounds=(1e-05, 100000.0)) White kernel. The main use-case of this kernel is as part of a sum-kernel where it explains the noise-component of the signal. Tuning its parameter corresponds to estimating the noise-level. k(x_1, x_2) = noise_level if x_1 == x_2 else 0 New in version 0.18. Parameters: noise_level : float, default: 1.0 Parameter controlling the noise level noise_level_bounds : pair of flo
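To make the sum-kernel use case concrete, the sketch below evaluates the white kernel formula given above and adds it to a second kernel. Pairing it with an RBF kernel of unit length scale is an assumption chosen for illustration, and the helper names white_kernel, rbf_kernel, and sum_kernel_matrix are hypothetical rather than library functions.

```python
import math


def white_kernel(x1, x2, noise_level=1.0):
    """k(x1, x2) = noise_level if x1 == x2 else 0."""
    return noise_level if x1 == x2 else 0.0


def rbf_kernel(x1, x2, length_scale=1.0):
    """Companion kernel for the sum (an assumption of this sketch)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x1, x2))
    return math.exp(-0.5 * sq_dist / length_scale ** 2)


def sum_kernel_matrix(X, noise_level=1.0):
    """Gram matrix of RBF + white kernel over the rows of X."""
    return [[rbf_kernel(a, b) + white_kernel(a, b, noise_level) for b in X]
            for a in X]


X = [[0.0], [1.0]]
K = sum_kernel_matrix(X, noise_level=0.5)
print(K)
```

Only the diagonal entries change: the white kernel adds noise_level where a point is compared with itself and contributes nothing between distinct points.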

sklearn.metrics.v_measure_score()

sklearn.metrics.v_measure_score(labels_true, labels_pred) V-measure cluster labeling given a ground truth. This score is identical to normalized_mutual_info_score. The V-measure is the harmonic mean between homogeneity and completeness: v = 2 * (homogeneity * completeness) / (homogeneity + completeness) This metric is independent of the absolute values of the labels: a permutation of the class or cluster label values won't change the score value in any way. This metric is furtherm
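The harmonic-mean formula above can be checked with a pure-Python sketch. It assumes the usual entropy-based definitions, homogeneity = 1 - H(C|K) / H(C) and completeness = 1 - H(K|C) / H(K), with each score set to 1.0 when its denominator entropy is zero; the function names are hypothetical and this is not the library's implementation.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy H(labels) in nats."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def conditional_entropy(a, b):
    """Conditional entropy H(a | b) in nats."""
    n = len(a)
    b_counts = Counter(b)
    joint = Counter(zip(a, b))
    return -sum((c / n) * log(c / b_counts[b_val])
                for (_, b_val), c in joint.items())


def v_measure(labels_true, labels_pred):
    """Harmonic mean of homogeneity and completeness (sketch)."""
    h_true, h_pred = entropy(labels_true), entropy(labels_pred)
    homogeneity = (1.0 if h_true == 0
                   else 1.0 - conditional_entropy(labels_true, labels_pred) / h_true)
    completeness = (1.0 if h_pred == 0
                    else 1.0 - conditional_entropy(labels_pred, labels_true) / h_pred)
    if homogeneity + completeness == 0:
        return 0.0
    return 2 * homogeneity * completeness / (homogeneity + completeness)


permuted = v_measure([0, 0, 1, 1], [1, 1, 0, 0])  # labels swapped
split = v_measure([0, 0, 1, 1], [0, 0, 1, 2])     # one class split in two
print(permuted, split)
```

Swapping the label values leaves the score unchanged, while splitting one class across two clusters keeps homogeneity at 1 but lowers completeness.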

feature_selection.RFECV()

class sklearn.feature_selection.RFECV(estimator, step=1, cv=None, scoring=None, verbose=0, n_jobs=1) Feature ranking with recursive feature elimination and cross-validated selection of the best number of features. Read more in the User Guide. Parameters: estimator : object A supervised learning estimator with a fit method that updates a coef_ attribute that holds the fitted parameters. Important features must correspond to high absolute values in the coef_ array. For instance, th

sklearn.datasets.load_iris()

sklearn.datasets.load_iris(return_X_y=False) Load and return the iris dataset (classification). The iris dataset is a classic and very easy multi-class classification dataset. Classes: 3; Samples per class: 50; Samples total: 150; Dimensionality: 4; Features: real, positive. Read more in the User Guide. Parameters: return_X_y : boolean, default=False. If True, returns (data, target) instead of a Bunch object. See below for more information about the data and target object. New in version