feature_extraction.text.HashingVectorizer()

class sklearn.feature_extraction.text.HashingVectorizer(input=u'content', encoding=u'utf-8', decode_error=u'strict', strip_accents=None, lowercase=True, preprocessor=None, tokenizer=None, stop_words=None, token_pattern=u'(?u)\b\w\w+\b', ngram_range=(1, 1), analyzer=u'word', n_features=1048576, binary=False, norm=u'l2', non_negative=False, dtype=<class 'numpy.float64'>) [source] Convert a collection of text documents to a matrix of token occurrences. It turns a collection of text documents into a scipy.sparse matrix holding token occurrence counts.
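
A minimal usage sketch: hashing three short documents into a fixed-size sparse matrix; n_features is shrunk from the default 2**20 only to keep the output small.

from sklearn.feature_extraction.text import HashingVectorizer

docs = ["the cat sat on the mat",
        "the dog ate my homework",
        "cats and dogs"]

# The vectorizer is stateless: transform can be called without fit.
vectorizer = HashingVectorizer(n_features=2**10, norm='l2')
X = vectorizer.transform(docs)
print(X.shape)   # (3, 1024), a scipy.sparse matrix of token occurrences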

L1 Penalty and Sparsity in Logistic Regression

Comparison of the sparsity (percentage of zero coefficients) of solutions when L1 and L2 penalties are used for different values of C. Large values of C give more freedom to the model; conversely, smaller values of C constrain the model more. In the L1 penalty case, this leads to sparser solutions. We classify 8x8 images of digits into two classes: 0-4 against 5-9. The visualization shows coefficients of the models for varying C.

Out (excerpt):
C=100.00
Sparsity with L1 penalty: 6.25%
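
A rough sketch of the comparison described above, fitting L1- and L2-penalized logistic regression with the liblinear solver and reporting the percentage of zero coefficients (the plotting is omitted):

import numpy as np
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

X, y = load_digits(return_X_y=True)
X = StandardScaler().fit_transform(X)
y = (y > 4).astype(int)          # two classes: digits 0-4 vs 5-9

for C in (100, 1, 0.01):
    for penalty in ('l1', 'l2'):
        clf = LogisticRegression(C=C, penalty=penalty, solver='liblinear')
        clf.fit(X, y)
        sparsity = np.mean(clf.coef_ == 0) * 100
        print("C=%.2f  %s sparsity: %.2f%%" % (C, penalty, sparsity))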

model_selection.ParameterSampler()

class sklearn.model_selection.ParameterSampler(param_distributions, n_iter, random_state=None) [source] Generator on parameters sampled from given distributions. Non-deterministic iterable over random candidate combinations for hyperparameter search. If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used. It is highly recommended to use continuous distributions for continuous parameters.
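
A small sketch of how the sampler might be used, mixing continuous scipy.stats distributions with a discrete list:

from scipy.stats import expon, uniform
from sklearn.model_selection import ParameterSampler

param_distributions = {
    'C': expon(scale=10),             # continuous: sampled with replacement
    'gamma': uniform(loc=1e-4, scale=1e-1),
    'kernel': ['rbf', 'linear'],      # discrete list
}
sampler = ParameterSampler(param_distributions, n_iter=5, random_state=0)
for params in sampler:
    print(params)                     # one dict of sampled values per iteration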

sklearn.datasets.make_gaussian_quantiles()

sklearn.datasets.make_gaussian_quantiles(mean=None, cov=1.0, n_samples=100, n_features=2, n_classes=3, shuffle=True, random_state=None) [source] Generate isotropic Gaussian samples and label them by quantile. This classification dataset is constructed by taking a multi-dimensional standard normal distribution and defining classes separated by nested concentric multi-dimensional spheres such that roughly equal numbers of samples are in each class (quantiles of the distribution). Read more in the User Guide.
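
A minimal sketch generating such a dataset:

from sklearn.datasets import make_gaussian_quantiles

# 300 samples in 2 features, split into 3 classes by nested quantile shells.
X, y = make_gaussian_quantiles(n_samples=300, n_features=2,
                               n_classes=3, random_state=0)
print(X.shape, sorted(set(y)))   # (300, 2) [0, 1, 2]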

sklearn.covariance.graph_lasso()

sklearn.covariance.graph_lasso(emp_cov, alpha, cov_init=None, mode='cd', tol=0.0001, enet_tol=0.0001, max_iter=100, verbose=False, return_costs=False, eps=2.2204460492503131e-16, return_n_iter=False) [source] L1-penalized covariance estimator. Read more in the User Guide.

Parameters:
emp_cov : 2D ndarray, shape (n_features, n_features)
    Empirical covariance from which to compute the covariance estimate.
alpha : positive float
    The regularization parameter: the higher alpha, the more regularization, the sparser the inverse covariance.
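
A rough sketch, assuming the graph_lasso signature above (newer scikit-learn releases expose the same routine as graphical_lasso):

import numpy as np
from sklearn.covariance import empirical_covariance, graph_lasso

rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=np.zeros(4), cov=np.eye(4), size=200)

# Estimate a sparse precision (inverse covariance) matrix.
emp_cov = empirical_covariance(X)
covariance, precision = graph_lasso(emp_cov, alpha=0.1)
print(np.round(precision, 2))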

Plot multinomial and One-vs-Rest Logistic Regression

Plot decision surface of multinomial and One-vs-Rest Logistic Regression. The hyperplanes corresponding to the three One-vs-Rest (OVR) classifiers are represented by the dashed lines.

Out:
training score : 0.995 (multinomial)
training score : 0.976 (ovr)

print(__doc__)

# Authors: Tom Dupre la Tour <tom.dupre-la-tour@m4x.org>
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
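
A rough sketch of the comparison behind those scores, using generic blob data rather than the example's own setup (the decision-surface plotting is omitted):

from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression

X, y = make_blobs(n_samples=1000, centers=3, random_state=42)

# Fit the same classifier with both multi-class strategies and compare
# training accuracy.
for multi_class in ('multinomial', 'ovr'):
    clf = LogisticRegression(solver='lbfgs', multi_class=multi_class,
                             max_iter=200).fit(X, y)
    print("training score : %.3f (%s)" % (clf.score(X, y), multi_class))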

covariance.OAS()

class sklearn.covariance.OAS(store_precision=True, assume_centered=False) [source] Oracle Approximating Shrinkage Estimator. Read more in the User Guide. OAS is a particular form of shrinkage described in "Shrinkage Algorithms for MMSE Covariance Estimation", Chen et al., IEEE Trans. on Sign. Proc., Volume 58, Issue 10, October 2010. The formula used here does not correspond to the one given in the article. It has been taken from the Matlab program available from the authors' webpage.
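
A minimal sketch of fitting the estimator on a small Gaussian sample:

import numpy as np
from sklearn.covariance import OAS

rng = np.random.RandomState(0)
X = rng.multivariate_normal(mean=np.zeros(5), cov=np.eye(5), size=50)

oas = OAS().fit(X)
print(oas.shrinkage_)           # shrinkage coefficient chosen by the OAS formula
print(oas.covariance_.shape)    # (5, 5) shrunk covariance estimate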

multioutput.MultiOutputRegressor()

class sklearn.multioutput.MultiOutputRegressor(estimator, n_jobs=1) [source] Multi-target regression. This strategy consists of fitting one regressor per target. It is a simple strategy for extending regressors that do not natively support multi-target regression.

Parameters:
estimator : estimator object
    An estimator object implementing fit and predict.
n_jobs : int, optional, default=1
    The number of jobs to run in parallel for fit. If -1, then the number of jobs is set to the number of cores.
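
A minimal sketch wrapping a single-target regressor so it handles several targets; GradientBoostingRegressor is used here only as an example of an estimator without native multi-target support:

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.multioutput import MultiOutputRegressor

X, y = make_regression(n_samples=200, n_features=10, n_targets=3,
                       random_state=0)

# One GradientBoostingRegressor is fitted per target column of y.
model = MultiOutputRegressor(GradientBoostingRegressor(random_state=0))
model.fit(X, y)
print(model.predict(X[:2]).shape)   # (2, 3): one column per target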

Plot Ridge coefficients as a function of the L2 regularization

Ridge Regression is the estimator used in this example. Each color in the left plot represents a different dimension of the coefficient vector, displayed as a function of the regularization parameter. The right plot shows how exact the solution is. This example illustrates how a well-defined solution is found by Ridge regression and how regularization affects the coefficients and their values. The plot on the right shows how the difference of the coefficients from the estimator changes as a function of regularization.
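
A sketch of the core computation behind the left plot, assuming synthetic data rather than the example's own design matrix (plotting omitted):

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(50, 10)
y = X.dot(rng.randn(10)) + 0.5 * rng.randn(50)

# One fitted coefficient vector per regularization strength alpha.
alphas = np.logspace(-4, 4, 50)
coefs = [Ridge(alpha=a).fit(X, y).coef_ for a in alphas]
print(np.array(coefs).shape)     # (50, 10)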

linear_model.Ridge()

class sklearn.linear_model.Ridge(alpha=1.0, fit_intercept=True, normalize=False, copy_X=True, max_iter=None, tol=0.001, solver='auto', random_state=None) [source] Linear least squares with L2 regularization. This model solves a regression model where the loss function is the linear least squares function and regularization is given by the l2-norm. Also known as Ridge Regression or Tikhonov regularization. This estimator has built-in support for multi-variate regression (i.e., when y is a 2d array of shape (n_samples, n_targets)).
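
A minimal sketch highlighting the built-in multi-variate support mentioned above: y has two columns, so coef_ has one row per target.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.RandomState(0)
X = rng.randn(100, 5)
Y = X.dot(rng.randn(5, 2)) + 0.1 * rng.randn(100, 2)

reg = Ridge(alpha=1.0).fit(X, Y)
print(reg.coef_.shape)           # (2, 5): one coefficient row per target
print(reg.predict(X[:3]).shape)  # (3, 2)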