cross_validation.LeavePOut()

class sklearn.cross_validation.LeavePOut(n, p) [source]

Leave-P-Out cross-validation iterator.

Deprecated since version 0.18: This module will be removed in 0.20. Use sklearn.model_selection.LeavePOut instead.

Provides train/test indices to split data into train/test sets. This results in testing on all distinct samples of size p, while the remaining n - p samples form the training set in each iteration.

Note: LeavePOut(n, p) is NOT equivalent to KFold(n, n_folds=n // p).
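
A minimal sketch of Leave-P-Out splitting, using the replacement class sklearn.model_selection.LeavePOut that the deprecation note points to (the data and the value of p are illustrative):

import numpy as np
from sklearn.model_selection import LeavePOut

X = np.arange(8).reshape(4, 2)   # 4 samples, 2 features
lpo = LeavePOut(p=2)             # every distinct pair of samples becomes a test set

for train_index, test_index in lpo.split(X):
    print("train:", train_index, "test:", test_index)
# with 4 samples and p=2 there are C(4, 2) = 6 train/test splits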

SVM Exercise

A tutorial exercise for using different SVM kernels. This exercise is used in the "Using kernels" part of the "Supervised learning: predicting an output variable from high-dimensional observations" section of "A tutorial on statistical-learning for scientific data processing".

print(__doc__)

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets, svm

iris = datasets.load_iris()
X = iris.data
y = iris.target

X = X[y != 0, :2]   # keep two of the three classes and the first two features
y = y[y != 0]

n_sample = len(X)
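
The snippet above stops after preparing the two-class iris subset. Continuing from the variables it defines, a minimal sketch of the step the exercise builds towards, fitting and scoring the different kernels with an illustrative 90/10 split and an untuned gamma, might look like:

rng = np.random.RandomState(0)
order = rng.permutation(n_sample)
X_shuffled, y_shuffled = X[order], y[order]
n_train = int(0.9 * n_sample)    # hold out the last 10% as a test set

for kernel in ('linear', 'rbf', 'poly'):
    clf = svm.SVC(kernel=kernel, gamma=10)
    clf.fit(X_shuffled[:n_train], y_shuffled[:n_train])
    print(kernel, clf.score(X_shuffled[n_train:], y_shuffled[n_train:]))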

sklearn.datasets.make_moons()

sklearn.datasets.make_moons(n_samples=100, shuffle=True, noise=None, random_state=None) [source]

Make two interleaving half circles. A simple toy dataset to visualize clustering and classification algorithms. Read more in the User Guide.

Parameters:

n_samples : int, optional (default=100)
    The total number of points generated.
shuffle : bool, optional (default=True)
    Whether to shuffle the samples.
noise : double or None (default=None)
    Standard deviation of Gaussian noise added to the data.
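
A minimal sketch of generating and plotting the two-moons data (parameter values here are illustrative):

import matplotlib.pyplot as plt
from sklearn.datasets import make_moons

X, y = make_moons(n_samples=200, noise=0.1, random_state=0)
plt.scatter(X[:, 0], X[:, 1], c=y)   # colour the two half circles by class label
plt.show()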

kernel_ridge.KernelRidge()

class sklearn.kernel_ridge.KernelRidge(alpha=1, kernel='linear', gamma=None, degree=3, coef0=1, kernel_params=None) [source]

Kernel ridge regression.

Kernel ridge regression (KRR) combines ridge regression (linear least squares with l2-norm regularization) with the kernel trick. It thus learns a linear function in the space induced by the respective kernel and the data. For non-linear kernels, this corresponds to a non-linear function in the original space. The form of the model learned by KRR is identical to support vector regression (SVR).
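
A minimal sketch of fitting KernelRidge with an RBF kernel on noisy toy data (the alpha and gamma values are illustrative, not tuned):

import numpy as np
from sklearn.kernel_ridge import KernelRidge

rng = np.random.RandomState(0)
X = rng.uniform(0, 5, size=(100, 1))            # 1-D inputs
y = np.sin(X).ravel() + 0.1 * rng.randn(100)    # noisy sine targets

model = KernelRidge(alpha=1.0, kernel='rbf', gamma=0.5)
model.fit(X, y)
print(model.predict(X[:5]))                     # predictions from the kernelized model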

linear_model.MultiTaskLasso()

class sklearn.linear_model.MultiTaskLasso(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=1000, tol=0.0001, warm_start=False, random_state=None, selection='cyclic') [source]

Multi-task Lasso model trained with an L1/L2 mixed-norm as regularizer.

The optimization objective for Lasso is:

    (1 / (2 * n_samples)) * ||Y - XW||^2_Fro + alpha * ||W||_21

where

    ||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the norms of each row. Read more in the User Guide.

Parameters:

alpha : float, optional (default=1.0)
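
A minimal sketch showing the joint-sparsity effect of the L1/L2 mixed norm: a feature is either kept for all tasks or dropped for all tasks (the data and alpha value are illustrative):

import numpy as np
from sklearn.linear_model import MultiTaskLasso

rng = np.random.RandomState(0)
X = rng.randn(50, 10)
W_true = np.zeros((10, 3))
W_true[:4] = rng.randn(4, 3)                 # only the first 4 features are informative
Y = X.dot(W_true) + 0.01 * rng.randn(50, 3)  # 3 regression tasks sharing one support

model = MultiTaskLasso(alpha=0.1).fit(X, Y)
print(model.coef_.shape)                     # (n_tasks, n_features) = (3, 10)
print(np.abs(model.coef_).sum(axis=0))       # same zero/non-zero pattern across tasks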

sklearn.feature_selection.chi2()

sklearn.feature_selection.chi2(X, y) [source]

Compute chi-squared stats between each non-negative feature and class.

This score can be used to select the n_features features with the highest values for the test chi-squared statistic from X, which must contain only non-negative features such as booleans or frequencies (e.g., term counts in document classification), relative to the classes.

Recall that the chi-square test measures dependence between stochastic variables, so using this function "weeds out" the features that are most likely to be independent of class and therefore irrelevant for classification.
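
A minimal sketch of chi2-based feature selection on iris, whose features are non-negative measurements:

from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)
scores, p_values = chi2(X, y)                          # per-feature chi-squared statistics
print(scores)

X_best = SelectKBest(chi2, k=2).fit_transform(X, y)    # keep the 2 highest-scoring features
print(X_best.shape)                                    # (150, 2)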

Kernel PCA

This example shows that Kernel PCA is able to find a projection of the data that makes the data linearly separable.

print(__doc__)

# Authors: Mathieu Blondel
#          Andreas Mueller
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt

from sklearn.decomposition import PCA, KernelPCA
from sklearn.datasets import make_circles

np.random.seed(0)

X, y = make_circles(n_samples=400, factor=.3, noise=.05)

kpca = KernelPCA(kernel="rbf", fit_inverse_transform=True, gamma=1)
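
The snippet cuts off before the projections are computed. Continuing from the objects defined above, a minimal sketch of the remaining steps (the full example plots more panels than shown here) is:

X_kpca = kpca.fit_transform(X)            # non-linear projection of the circles
X_back = kpca.inverse_transform(X_kpca)   # approximate reconstruction in input space

pca = PCA(n_components=2)
X_pca = pca.fit_transform(X)              # linear PCA for comparison

plt.scatter(X_kpca[y == 0, 0], X_kpca[y == 0, 1], c="red")
plt.scatter(X_kpca[y == 1, 0], X_kpca[y == 1, 1], c="blue")
plt.title("Projection by kernel PCA")
plt.show()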

sklearn.metrics.label_ranking_loss()

sklearn.metrics.label_ranking_loss(y_true, y_score, sample_weight=None) [source]

Compute ranking loss measure.

Compute the average number of label pairs that are incorrectly ordered given y_score, weighted by the size of the label set and the number of labels not in the label set. This is similar to the error set size, but weighted by the number of relevant and irrelevant labels. The best performance is achieved with a ranking loss of zero. Read more in the User Guide.

New in version 0.17.
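
A minimal worked example on a tiny multilabel problem:

import numpy as np
from sklearn.metrics import label_ranking_loss

y_true = np.array([[1, 0, 0], [0, 0, 1]])                  # binary indicators of true labels
y_score = np.array([[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]])    # predicted scores per label
print(label_ranking_loss(y_true, y_score))
# 0.75: the first sample misorders 1 of its 2 (relevant, irrelevant) pairs, the second misorders both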

sklearn.datasets.fetch_lfw_pairs()

sklearn.datasets.fetch_lfw_pairs(subset='train', data_home=None, funneled=True, resize=0.5, color=False, slice_=(slice(70, 195, None), slice(78, 172, None)), download_if_missing=True) [source]

Loader for the Labeled Faces in the Wild (LFW) pairs dataset.

This dataset is a collection of JPEG pictures of famous people collected on the internet; all details are available on the official website: http://vis-www.cs.umass.edu/lfw/

Each picture is centered on a single face. Each pixel of each channel (color in RGB) is encoded by a float in the range 0.0 - 1.0.
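
A minimal sketch of loading and inspecting the training pairs; note that the first call downloads a large image archive, and the attribute shapes in the comments assume the default grayscale, resized images:

from sklearn.datasets import fetch_lfw_pairs

lfw_pairs = fetch_lfw_pairs(subset='train')
print(lfw_pairs.pairs.shape)     # (n_pairs, 2, height, width): two face images per pair
print(lfw_pairs.target[:10])     # 1 = same person, 0 = different people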

sklearn.metrics.median_absolute_error()

sklearn.metrics.median_absolute_error(y_true, y_pred) [source]

Median absolute error regression loss. Read more in the User Guide.

Parameters:

y_true : array-like of shape = (n_samples)
    Ground truth (correct) target values.
y_pred : array-like of shape = (n_samples)
    Estimated target values.

Returns:

loss : float
    A positive floating point value (the best value is 0.0).

Examples

>>> from sklearn.metrics import median_absolute_error
>>> y_true = [3, -0.5, 2, 7]
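
A minimal sketch completing the truncated example; the y_pred values are illustrative:

from sklearn.metrics import median_absolute_error

y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]
print(median_absolute_error(y_true, y_pred))   # median of |y_true - y_pred| = 0.5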