Plot Ridge coefficients as a function of the L2 regularization

Ridge Regression is the estimator used in this example. Each color in the left plot represents a different dimension of the coefficient vector, displayed as a function of the regularization parameter. The right plot shows how exact the solution is. This example illustrates how a well-defined solution is found by Ridge regression and how regularization affects the coefficients and their values. The plot on the right shows how the difference of the coefficients from the estimator changes as a function of regularization.
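A minimal sketch of the idea rather than the full gallery script: it fits Ridge over a grid of alpha values on a small ill-conditioned design (the Hilbert-matrix setup and the alpha grid are assumptions made here for illustration) and plots each coefficient dimension against the regularization strength.

# Sketch: Ridge coefficients as a function of the regularization strength.
# The Hilbert-matrix design and the alpha grid are illustrative assumptions.
import numpy as np
import matplotlib.pyplot as plt
from sklearn.linear_model import Ridge

X = 1.0 / (np.arange(1, 11) + np.arange(0, 10)[:, np.newaxis])  # ill-conditioned 10x10 design
y = np.ones(10)

alphas = np.logspace(-10, -2, 200)
coefs = []
for alpha in alphas:
    ridge = Ridge(alpha=alpha, fit_intercept=False)
    ridge.fit(X, y)
    coefs.append(ridge.coef_)

plt.semilogx(alphas, coefs)  # one line per coefficient dimension
plt.xlabel("alpha")
plt.ylabel("coefficient value")
plt.title("Ridge coefficients as a function of the regularization")
plt.show()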

Plot randomly generated classification dataset

Plot several randomly generated 2D classification datasets. This example illustrates the datasets.make_classification, datasets.make_blobs and datasets.make_gaussian_quantiles functions. For make_classification, three binary and two multi-class classification datasets are generated, with different numbers of informative features and clusters per class.

print(__doc__)

import matplotlib.pyplot as plt

from sklearn.datasets import make_classification
from sklearn.datasets import make_blobs
from sklearn.datasets import make_gaussian_quantiles
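A minimal sketch continuing from the imports above (the parameter values and the one-row layout are illustrative assumptions): generate one dataset with each helper and scatter-plot it.

# Sketch: one dataset per generator, plotted side by side.
plt.figure(figsize=(9, 3))

plt.subplot(131)
X, y = make_classification(n_features=2, n_redundant=0, n_informative=2,
                           n_clusters_per_class=1, random_state=0)
plt.scatter(X[:, 0], X[:, 1], c=y, s=20)
plt.title("make_classification")

plt.subplot(132)
X, y = make_blobs(n_features=2, centers=3, random_state=0)
plt.scatter(X[:, 0], X[:, 1], c=y, s=20)
plt.title("make_blobs")

plt.subplot(133)
X, y = make_gaussian_quantiles(n_features=2, n_classes=3, random_state=0)
plt.scatter(X[:, 0], X[:, 1], c=y, s=20)
plt.title("make_gaussian_quantiles")

plt.show()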

Plot different SVM classifiers in the iris dataset

Comparison of different linear SVM classifiers on a 2D projection of the iris dataset. We only consider the first two features of this dataset: sepal length and sepal width. This example shows how to plot the decision surface for four SVM classifiers with different kernels. The linear models LinearSVC() and SVC(kernel='linear') yield slightly different decision boundaries. This can be a consequence of the following differences: LinearSVC minimizes the squared hinge loss while SVC minimizes the regular hinge loss.
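A minimal sketch of the first comparison (the C value and the grid resolution are illustrative assumptions): fit LinearSVC and SVC(kernel='linear') on the two iris features and measure where their predictions differ.

# Sketch: compare the two linear SVM variants on the first two iris features.
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC, LinearSVC

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target

linear_svc = LinearSVC(C=1.0, max_iter=10000).fit(X, y)
svc_linear = SVC(kernel='linear', C=1.0).fit(X, y)

# Evaluate both decision rules on a coarse grid to see where they disagree.
xx, yy = np.meshgrid(np.linspace(X[:, 0].min(), X[:, 0].max(), 100),
                     np.linspace(X[:, 1].min(), X[:, 1].max(), 100))
grid = np.c_[xx.ravel(), yy.ravel()]
disagreement = np.mean(linear_svc.predict(grid) != svc_linear.predict(grid))
print("fraction of the grid where the two models disagree:", disagreement)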

Plot multi-class SGD on the iris dataset

Plot the decision surface of multi-class SGD on the iris dataset. The hyperplanes corresponding to the three one-versus-all (OVA) classifiers are represented by the dashed lines.

print(__doc__)

import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.linear_model import SGDClassifier

# import some data to play with
iris = datasets.load_iris()
X = iris.data[:, :2]  # we only take the first two features. We could
                      # avoid this ugly slicing by using a two-dim dataset
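A minimal sketch continuing from the snippet above (the alpha value and iteration count are illustrative assumptions): fit the multi-class SGD model and recover the one-versus-all hyperplanes from coef_ and intercept_.

# Sketch: one coef_ row and one intercept_ entry per OVA classifier.
y = iris.target
clf = SGDClassifier(alpha=0.001, max_iter=100, random_state=42).fit(X, y)

for c, (w, b) in enumerate(zip(clf.coef_, clf.intercept_)):
    print("class %d hyperplane: %.3f * x1 + %.3f * x2 + %.3f = 0" % (c, w[0], w[1], b))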

Plot multinomial and One-vs-Rest Logistic Regression

Plot the decision surface of multinomial and One-vs-Rest Logistic Regression. The hyperplanes corresponding to the three One-vs-Rest (OVR) classifiers are represented by the dashed lines.

Out:
training score : 0.995 (multinomial)
training score : 0.976 (ovr)

print(__doc__)

# Authors: Tom Dupre la Tour <tom.dupre-la-tour@m4x.org>
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt
from sklearn.datasets import make_blobs
from sklearn.linear_model import LogisticRegression
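A minimal sketch of the comparison, continuing from the imports above (the blob layout is an assumption, and One-vs-Rest is obtained here with OneVsRestClassifier rather than a LogisticRegression option): fit both variants and report their training scores.

# Sketch: multinomial vs One-vs-Rest logistic regression on a 3-class blob dataset.
from sklearn.multiclass import OneVsRestClassifier

centers = [[-5, 0], [0, 1.5], [5, -1]]
X, y = make_blobs(n_samples=1000, centers=centers, random_state=40)

multinomial = LogisticRegression(max_iter=1000).fit(X, y)   # multinomial by default
ovr = OneVsRestClassifier(LogisticRegression(max_iter=1000)).fit(X, y)

print("training score : %.3f (multinomial)" % multinomial.score(X, y))
print("training score : %.3f (ovr)" % ovr.score(X, y))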

Plot classification probability

Plot the classification probability for different classifiers. We use a 3-class dataset, and we classify it with a Support Vector classifier, L1- and L2-penalized logistic regression with either a One-Vs-Rest or multinomial setting, and Gaussian process classification. The logistic regression is not a multiclass classifier out of the box; as a result it can identify only the first class.

Out:
classif_rate for GPC : 82.666667
classif_rate for L2 logistic (OvR) : 76.666667
classif_rate for
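A minimal sketch of the comparison (the classifier settings are illustrative assumptions, and the rate is computed on the training data here): fit a few probabilistic classifiers on the two iris features and report the accuracy of the most probable class, in the spirit of the classif_rate values above.

# Sketch: predicted-probability accuracy for a few probabilistic classifiers.
import numpy as np
from sklearn import datasets
from sklearn.svm import SVC
from sklearn.linear_model import LogisticRegression
from sklearn.gaussian_process import GaussianProcessClassifier
from sklearn.gaussian_process.kernels import RBF

iris = datasets.load_iris()
X, y = iris.data[:, :2], iris.target   # first two features only

classifiers = {
    "L2 logistic": LogisticRegression(C=10.0, max_iter=1000),
    "SVC (rbf)": SVC(kernel='rbf', probability=True, random_state=0),
    "GPC": GaussianProcessClassifier(kernel=1.0 * RBF(1.0)),
}
for name, clf in classifiers.items():
    clf.fit(X, y)
    proba = clf.predict_proba(X)
    classif_rate = np.mean(proba.argmax(axis=1) == y) * 100
    print("classif_rate for %s : %f" % (name, classif_rate))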

Pixel importances with a parallel forest of trees

This example shows the use of forests of trees to evaluate the importance of the pixels in an image classification task (faces). The hotter the pixel, the more important it is. The code below also illustrates how the construction of the forest and the computation of the predictions can be parallelized across multiple jobs.

Out:
Fitting ExtraTreesClassifier on faces data with 1 cores... done in 3.341s

print(__doc__)

from time import time
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_olivetti_faces
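A minimal sketch continuing from the imports above (n_estimators and n_jobs are illustrative assumptions): fit a parallel ExtraTreesClassifier on the Olivetti faces and view the per-pixel feature importances as an image.

# Sketch: per-pixel importances from a forest of extremely randomized trees.
from sklearn.ensemble import ExtraTreesClassifier

data = fetch_olivetti_faces()
X, y = data.data, data.target

n_jobs = 1
forest = ExtraTreesClassifier(n_estimators=100, n_jobs=n_jobs, random_state=0)

print("Fitting ExtraTreesClassifier on faces data with %d cores..." % n_jobs)
t0 = time()
forest.fit(X, y)
print("done in %0.3fs" % (time() - t0))

importances = forest.feature_importances_.reshape(data.images[0].shape)
plt.matshow(importances, cmap=plt.cm.hot)
plt.title("Pixel importances with forests of trees")
plt.show()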

Plot class probabilities calculated by the VotingClassifier

Plot the class probabilities of the first sample in a toy dataset, predicted by three different classifiers and averaged by the VotingClassifier. First, three exemplary classifiers are initialized (LogisticRegression, GaussianNB, and RandomForestClassifier) and used to initialize a soft-voting VotingClassifier with weights [1, 1, 5], which means that the predicted probabilities of the RandomForestClassifier count 5 times as much as those of the other classifiers when the averaged probability is calculated.
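A minimal sketch (the toy data and hyperparameters are illustrative assumptions): average the predicted probabilities of three classifiers with a soft-voting VotingClassifier weighted [1, 1, 5] and inspect the first sample.

# Sketch: soft voting with unequal weights; the RandomForest gets weight 5.
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.ensemble import RandomForestClassifier, VotingClassifier

X, y = make_classification(n_samples=100, random_state=123)

clf1 = LogisticRegression(max_iter=1000, random_state=123)
clf2 = GaussianNB()
clf3 = RandomForestClassifier(n_estimators=25, random_state=123)
eclf = VotingClassifier(estimators=[('lr', clf1), ('gnb', clf2), ('rf', clf3)],
                        voting='soft', weights=[1, 1, 5])

for label, clf in [('LogisticRegression', clf1), ('GaussianNB', clf2),
                   ('RandomForest', clf3), ('VotingClassifier', eclf)]:
    clf.fit(X, y)
    print(label, clf.predict_proba(X[:1]))   # class probabilities for the first sample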

pipeline.Pipeline()

class sklearn.pipeline.Pipeline(steps)
Pipeline of transforms with a final estimator. Sequentially apply a list of transforms and a final estimator. Intermediate steps of the pipeline must be 'transforms', that is, they must implement fit and transform methods. The final estimator only needs to implement fit. The purpose of the pipeline is to assemble several steps that can be cross-validated together while setting different parameters. For this, it enables setting parameters of the various steps using their names and the parameter name separated by a '__'.
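A minimal sketch (the steps and parameter values are illustrative assumptions): build a Pipeline and set a nested parameter with the '<step name>__<parameter>' syntax described above.

# Sketch: a two-step pipeline with a nested parameter set via step__param.
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.datasets import load_iris

pipe = Pipeline(steps=[('scale', StandardScaler()), ('svc', SVC())])
pipe.set_params(svc__C=10)          # reaches the C parameter of the 'svc' step

X, y = load_iris(return_X_y=True)
pipe.fit(X, y)
print(pipe.score(X, y))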

Pipelining

The PCA does an unsupervised dimensionality reduction, while the logistic regression does the prediction. We use a GridSearchCV to set the dimensionality of the PCA.

print(__doc__)

# Code source: Gaël Varoquaux
# Modified for documentation by Jaques Grobler
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt
from sklearn import linear_model, decomposition, datasets
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

logistic = linear_model.LogisticRegression()
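A minimal sketch continuing from the snippet above (the parameter grid and the digits dataset are illustrative assumptions): chain PCA and logistic regression in a Pipeline and let GridSearchCV pick the number of PCA components.

# Sketch: grid-search the PCA dimensionality inside a pipeline.
pca = decomposition.PCA()
pipe = Pipeline(steps=[('pca', pca), ('logistic', logistic)])

X_digits, y_digits = datasets.load_digits(return_X_y=True)

param_grid = {
    'pca__n_components': [5, 20, 30, 40, 50, 64],
    'logistic__C': np.logspace(-4, 4, 4),
}
search = GridSearchCV(pipe, param_grid, cv=5)
search.fit(X_digits, y_digits)
print("Best parameter (CV score=%0.3f):" % search.best_score_)
print(search.best_params_)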