Scaling the regularization parameter for SVCs

The following example illustrates the effect of scaling the regularization parameter when using Support Vector Machines for classification. For SVC classification, we are interested in a risk minimization for the equation:

C \sum_{i=1}^{n} \mathcal{L}(f(x_i), y_i) + \Omega(w)

where
- C is used to set the amount of regularization,
- \mathcal{L} is a loss function of our samples and our model parameters,
- \Omega is a penalty function of our model parameters.

If we consider the loss function to be the individual error per sample, then the data-fit term, or the sum of the error for each sample, will increase as we add more samples, while the penalization term will not.
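
A minimal sketch of the idea, not the gallery example itself: because the data-fit term above grows with the number of samples, C can be rescaled by 1 / n_samples to keep the effective amount of regularization comparable across training-set sizes (the particular scaling used here is only illustrative).

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.svm import LinearSVC

    X, y = make_classification(n_samples=500, n_features=20, random_state=0)

    for n in (100, 250, 500):
        # Rescale C so the regularization keeps pace with the growing data-fit term.
        clf = LinearSVC(C=1.0 / n, dual=False).fit(X[:n], y[:n])
        print(n, clf.score(X[:n], y[:n]))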

sklearn.datasets.clear_data_home()

sklearn.datasets.clear_data_home(data_home=None)

Delete all the content of the data home cache.
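
A minimal usage sketch; calling the function with no argument clears the default cache location.

    from sklearn.datasets import clear_data_home

    # Remove every dataset cached under the default scikit-learn data home.
    clear_data_home()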

Gradient Boosting regularization

Illustration of the effect of different regularization strategies for Gradient Boosting. The example is taken from Hastie et al. 2009. The loss function used is binomial deviance. Regularization via shrinkage (learning_rate < 1.0) improves performance considerably. In combination with shrinkage, stochastic gradient boosting (subsample < 1.0) can produce more accurate models by reducing the variance via bagging. Subsampling without shrinkage usually does poorly. Another strategy to reduce the variance is to subsample the features, analogous to the random splits in Random Forests (via the max_features parameter).
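
A minimal sketch of the knobs discussed above (the dataset and parameter values are illustrative, not those of the original example): shrinkage via learning_rate, stochastic gradient boosting via subsample, and feature subsampling via max_features.

    from sklearn.datasets import make_hastie_10_2
    from sklearn.ensemble import GradientBoostingClassifier

    X, y = make_hastie_10_2(n_samples=4000, random_state=1)

    clf = GradientBoostingClassifier(n_estimators=200,
                                     learning_rate=0.1,  # shrinkage
                                     subsample=0.5,      # stochastic gradient boosting
                                     max_features=2,     # feature subsampling
                                     random_state=0)
    clf.fit(X, y)
    print(clf.score(X, y))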

Recursive feature elimination with cross-validation

A recursive feature elimination example with automatic tuning of the number of features selected with cross-validation.

Out: Optimal number of features : 3

    print(__doc__)

    import matplotlib.pyplot as plt
    from sklearn.svm import SVC
    from sklearn.model_selection import StratifiedKFold
    from sklearn.feature_selection import RFECV
    from sklearn.datasets import make_classification

    # Build a classification task using 3 informative features
    X, y = make_classification(n_samples=1000, n_features=25, n_informative=3,
                               n_redundant=2, n_repeated=0, n_classes=8,
                               n_clusters_per_class=1, random_state=0)
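
The listing above is cut off; a sketch of the step it leads to (assumed, not the verbatim original) fits RFECV with a linear-kernel SVC and stratified cross-validation, then reports the selected number of features.

    # Continues the snippet above: X, y, SVC, StratifiedKFold and RFECV are already defined/imported.
    svc = SVC(kernel="linear")
    rfecv = RFECV(estimator=svc, step=1, cv=StratifiedKFold(2), scoring='accuracy')
    rfecv.fit(X, y)
    print("Optimal number of features : %d" % rfecv.n_features_)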

Concatenating multiple feature extraction methods

In many real-world examples, there are many ways to extract features from a dataset. Often it is beneficial to combine several methods to obtain good performance. This example shows how to use FeatureUnion to combine features obtained by PCA and univariate selection. Combining features using this transformer has the benefit that it allows cross-validation and grid searches over the whole process. The combination used in this example is not particularly helpful on this dataset and is only used to illustrate the usage of FeatureUnion.
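
A minimal sketch of the combination described above (the iris dataset and the parameter values are assumptions for illustration): PCA features and univariately selected features are stacked with FeatureUnion and fed to an SVM inside a Pipeline, so the whole chain can be cross-validated or grid-searched.

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest
    from sklearn.pipeline import FeatureUnion, Pipeline
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    combined = FeatureUnion([("pca", PCA(n_components=2)),
                             ("univ_select", SelectKBest(k=1))])
    pipeline = Pipeline([("features", combined), ("svm", SVC(kernel="linear"))])
    pipeline.fit(X, y)
    print(pipeline.score(X, y))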

sklearn.metrics.mean_absolute_error()

sklearn.metrics.mean_absolute_error(y_true, y_pred, sample_weight=None, multioutput='uniform_average')

Mean absolute error regression loss. Read more in the User Guide.

Parameters:
y_true : array-like of shape = (n_samples) or (n_samples, n_outputs) Ground truth (correct) target values.
y_pred : array-like of shape = (n_samples) or (n_samples, n_outputs) Estimated target values.
sample_weight : array-like of shape = (n_samples), optional Sample weights.
multioutput : string in ['raw_values', 'uniform_average'] or array-like of shape (n_outputs) Defines aggregating of multiple output values.
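
A minimal usage sketch of the function documented above, including the multioutput case with the default 'uniform_average' aggregation.

    from sklearn.metrics import mean_absolute_error

    y_true = [3, -0.5, 2, 7]
    y_pred = [2.5, 0.0, 2, 8]
    print(mean_absolute_error(y_true, y_pred))          # 0.5

    # Multioutput targets: errors are averaged per output, then averaged uniformly.
    y_true = [[0.5, 1], [-1, 1], [7, -6]]
    y_pred = [[0, 2], [-1, 2], [8, -5]]
    print(mean_absolute_error(y_true, y_pred))          # 0.75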

sklearn.svm.libsvm.predict_proba()

sklearn.svm.libsvm.predict_proba()

Predict probabilities. svm_model stores all parameters needed to predict a given value. For speed, all real work is done at the C level in function copy_predict (libsvm_helper.c). We have to reconstruct model and parameters to make sure we stay in sync with the Python object. See sklearn.svm.predict for a complete list of parameters.

Parameters:
X : array-like, dtype=float
kernel : {'linear', 'rbf', 'poly', 'sigmoid', 'precomputed'}

Returns: dec_values : array Predicted values.
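
The function above is a low-level libsvm binding; most code goes through the public estimator instead. A minimal sketch using the public API (an equivalent route, not the binding itself):

    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    clf = SVC(kernel='rbf', probability=True).fit(X, y)
    print(clf.predict_proba(X[:2]))   # per-class membership probabilities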

grid_search.ParameterSampler()

Warning: DEPRECATED

class sklearn.grid_search.ParameterSampler(param_distributions, n_iter, random_state=None)

Generator on parameters sampled from given distributions.

Deprecated since version 0.18: This module will be removed in 0.20. Use sklearn.model_selection.ParameterSampler instead.

Non-deterministic iterable over random candidate combinations for hyperparameter search. If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used.
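
A minimal usage sketch of the non-deprecated replacement named above, sklearn.model_selection.ParameterSampler, mixing a continuous distribution with a list:

    from scipy.stats import uniform
    from sklearn.model_selection import ParameterSampler

    param_distributions = {'C': uniform(loc=0, scale=4), 'kernel': ['linear', 'rbf']}
    for params in ParameterSampler(param_distributions, n_iter=4, random_state=0):
        print(params)   # one random candidate combination per iteration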

sklearn.metrics.pairwise_distances_argmin_min()

sklearn.metrics.pairwise_distances_argmin_min(X, Y, axis=1, metric='euclidean', batch_size=500, metric_kwargs=None)

Compute minimum distances between one point and a set of points. This function computes, for each row in X, the index of the row of Y which is closest (according to the specified distance). The minimal distances are also returned. This is mostly equivalent to calling:

(pairwise_distances(X, Y=Y, metric=metric).argmin(axis=axis),
 pairwise_distances(X, Y=Y, metric=metric).min(axis=axis))

but uses much less memory, and is faster for large arrays.
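
A minimal usage sketch of the function documented above:

    import numpy as np
    from sklearn.metrics import pairwise_distances_argmin_min

    X = np.array([[0.0, 0.0], [1.0, 1.0]])
    Y = np.array([[0.1, 0.0], [2.0, 2.0], [1.0, 0.9]])

    argmin, distances = pairwise_distances_argmin_min(X, Y)
    print(argmin)      # index of the closest row of Y for each row of X
    print(distances)   # the corresponding minimum distances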

A demo of the Spectral Biclustering algorithm

This example demonstrates how to generate a checkerboard dataset and bicluster it using the Spectral Biclustering algorithm. The data is generated with the make_checkerboard function, then shuffled and passed to the Spectral Biclustering algorithm. The rows and columns of the shuffled matrix are rearranged to show the biclusters found by the algorithm. The outer product of the row and column label vectors shows a representation of the checkerboard structure. Out: consensus score: 1.0
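
A minimal sketch of the workflow described above (the shapes, noise level, and number of clusters are illustrative, not necessarily those of the demo):

    import numpy as np
    from sklearn.cluster import SpectralBiclustering
    from sklearn.datasets import make_checkerboard
    from sklearn.metrics import consensus_score

    n_clusters = (4, 3)
    data, rows, columns = make_checkerboard(shape=(300, 300), n_clusters=n_clusters,
                                            noise=10, shuffle=False, random_state=0)

    # Shuffle rows and columns, then try to recover the checkerboard structure.
    rng = np.random.RandomState(0)
    row_idx = rng.permutation(data.shape[0])
    col_idx = rng.permutation(data.shape[1])
    data_shuffled = data[row_idx][:, col_idx]

    model = SpectralBiclustering(n_clusters=n_clusters, method='log', random_state=0)
    model.fit(data_shuffled)

    score = consensus_score(model.biclusters_,
                            (rows[:, row_idx], columns[:, col_idx]))
    print("consensus score: {:.1f}".format(score))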