Recursive feature elimination

A recursive feature elimination example showing the relevance of pixels in a digit classification task.

See also: Recursive feature elimination with cross-validation

print(__doc__)

from sklearn.svm import SVC
from sklearn.datasets import load_digits
from sklearn.feature_selection import RFE
import matplotlib.pyplot as plt

# Load the digits dataset
digits = load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target

# Create the RFE object and rank each pixel
svc = SVC(kernel="linear", C=1)
rfe = RFE(estimator=svc, n_features_to_select=1, step=1)
rfe.fit(X, y)
ranking = rfe.ranking_.reshape(digits.images[0].shape)

# Plot pixel ranking
plt.matshow(ranking, cmap=plt.cm.Blues)
plt.colorbar()
plt.title("Ranking of pixels with RFE")
plt.show()

Recognizing hand-written digits

An example showing how scikit-learn can be used to recognize images of hand-written digits. This example is commented in the tutorial section of the user manual.

Out:

Classification report for classifier SVC(C=1.0, cache_size=200, class_weight=None,
    coef0=0.0, decision_function_shape=None, degree=3, gamma=0.001, kernel='rbf',
    max_iter=-1, probability=False, random_state=None, shrinking=True, tol=0.001,
    verbose=False):
             precision    recall  f1-score   support
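A minimal sketch of the workflow behind this example, assuming a simple train/test split rather than the exact split used in the original script; the gamma value mirrors the classifier shown in the report above.

from sklearn import datasets, metrics, svm
from sklearn.model_selection import train_test_split

# Flatten the 8x8 digit images into feature vectors of length 64
digits = datasets.load_digits()
X = digits.images.reshape((len(digits.images), -1))
y = digits.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=0)

# gamma=0.001 matches the SVC parameters printed in the report above
clf = svm.SVC(gamma=0.001)
clf.fit(X_train, y_train)
print(metrics.classification_report(y_test, clf.predict(X_test)))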

Receiver Operating Characteristic with cross validation

Example of Receiver Operating Characteristic (ROC) metric to evaluate classifier output quality using cross-validation. ROC curves typically feature true positive rate on the Y axis and false positive rate on the X axis. This means that the top left corner of the plot is the "ideal" point: a false positive rate of zero and a true positive rate of one. This is not very realistic, but it does mean that a larger area under the curve (AUC) is usually better. The "steepness" of ROC curves is also important, since it is ideal to maximize the true positive rate while minimizing the false positive rate.
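A minimal sketch of cross-validated ROC/AUC computation, assuming a synthetic binary dataset from make_classification rather than the dataset used in the full example; the per-fold curve plotting is omitted.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, auc

# Assumed binary classification problem
X, y = make_classification(n_samples=500, n_classes=2, random_state=0)

cv = StratifiedKFold(n_splits=6)
classifier = SVC(kernel='linear', probability=True, random_state=0)

aucs = []
for train_idx, test_idx in cv.split(X, y):
    # Fit on the training fold, score probabilities on the held-out fold
    probas_ = classifier.fit(X[train_idx], y[train_idx]).predict_proba(X[test_idx])
    fpr, tpr, thresholds = roc_curve(y[test_idx], probas_[:, 1])
    aucs.append(auc(fpr, tpr))

print("Mean AUC across folds: %0.2f" % np.mean(aucs))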

Receiver Operating Characteristic

Example of Receiver Operating Characteristic (ROC) metric to evaluate classifier output quality. ROC curves typically feature true positive rate on the Y axis and false positive rate on the X axis. This means that the top left corner of the plot is the "ideal" point: a false positive rate of zero and a true positive rate of one. This is not very realistic, but it does mean that a larger area under the curve (AUC) is usually better. The "steepness" of ROC curves is also important, since it is ideal to maximize the true positive rate while minimizing the false positive rate.
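A minimal sketch of computing a single ROC curve and its AUC, again on an assumed synthetic binary problem; the plotting from the full example is left out.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import roc_curve, auc

# Assumed binary classification problem
X, y = make_classification(n_samples=500, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# decision_function scores are swept over thresholds to build the curve
clf = SVC(kernel='linear').fit(X_train, y_train)
scores = clf.decision_function(X_test)

fpr, tpr, thresholds = roc_curve(y_test, scores)
print("Area under the ROC curve: %0.2f" % auc(fpr, tpr))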

RBF SVM parameters

This example illustrates the effect of the parameters gamma and C of the Radial Basis Function (RBF) kernel SVM. Intuitively, the gamma parameter defines how far the influence of a single training example reaches, with low values meaning "far" and high values meaning "close". The gamma parameter can be seen as the inverse of the radius of influence of samples selected by the model as support vectors. The C parameter trades off misclassification of training examples against simplicity of the decision surface.
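A minimal sketch of searching over gamma and C for an RBF SVM; the grid and the iris dataset here are illustrative assumptions, while the full example uses a much finer grid and visualizes the resulting decision surfaces and validation scores.

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

iris = load_iris()

# Small illustrative grid over the two RBF kernel parameters
param_grid = {'C': np.logspace(-2, 2, 5), 'gamma': np.logspace(-3, 1, 5)}
grid = GridSearchCV(SVC(kernel='rbf'), param_grid=param_grid, cv=3)
grid.fit(iris.data, iris.target)

print("Best parameters:", grid.best_params_)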

random_projection.SparseRandomProjection()

class sklearn.random_projection.SparseRandomProjection(n_components='auto', density='auto', eps=0.1, dense_output=False, random_state=None) [source]

Reduce dimensionality through sparse random projection.

A sparse random matrix is an alternative to a dense random projection matrix: it guarantees similar embedding quality while being much more memory efficient and allowing faster computation of the projected data. If we note s = 1 / density, the components of the random matrix are drawn from:

    -sqrt(s) / sqrt(n_components)   with probability 1 / (2s)
     0                              with probability 1 - 1 / s
    +sqrt(s) / sqrt(n_components)   with probability 1 / (2s)
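A minimal usage sketch, with an assumed random 100 x 10000 input matrix; n_components is left at 'auto' so it is chosen from the Johnson-Lindenstrauss bound for the given eps.

import numpy as np
from sklearn.random_projection import SparseRandomProjection

rng = np.random.RandomState(42)
X = rng.rand(100, 10000)          # 100 samples in a 10000-dimensional space

transformer = SparseRandomProjection(random_state=rng)
X_new = transformer.fit_transform(X)
print(X_new.shape)                # reduced dimensionality chosen automatically
print(transformer.density_)       # density inferred from the input dimension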

random_projection.GaussianRandomProjection()

class sklearn.random_projection.GaussianRandomProjection(n_components='auto', eps=0.1, random_state=None) [source]

Reduce dimensionality through Gaussian random projection.

The components of the random matrix are drawn from N(0, 1 / n_components). Read more in the User Guide.

Parameters:

n_components : int or 'auto', optional (default = 'auto')
    Dimensionality of the target projection space. n_components can be automatically adjusted according to the number of samples in the dataset and the bound given by the Johnson-Lindenstrauss lemma; in that case the quality of the embedding is controlled by the eps parameter.
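A minimal usage sketch, again with an assumed random 100 x 10000 input matrix and the default 'auto' setting for n_components.

import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(42)
X = rng.rand(100, 10000)

transformer = GaussianRandomProjection(random_state=rng)
X_new = transformer.fit_transform(X)
print(X_new.shape)   # (100, n_components) with n_components derived from eps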

qda.QDA()

Warning: DEPRECATED

class sklearn.qda.QDA(priors=None, reg_param=0.0, store_covariances=False, tol=0.0001) [source]

Alias for sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis.

Deprecated since version 0.17: This class will be removed in 0.19. Use sklearn.discriminant_analysis.QuadraticDiscriminantAnalysis instead.

Methods

decision_function(X)                   Apply decision function to an array of samples.
fit(X, y[, store_covariances, tol])    Fit the model according to the given training data.
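A minimal sketch of the recommended replacement, fitting QuadraticDiscriminantAnalysis directly; the iris dataset is used here purely for illustration.

from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis
from sklearn.datasets import load_iris

iris = load_iris()

# Same constructor parameters as the deprecated sklearn.qda.QDA alias
clf = QuadraticDiscriminantAnalysis(priors=None, reg_param=0.0)
clf.fit(iris.data, iris.target)
print(clf.predict(iris.data[:5]))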

Putting it all together

Pipelining

We have seen that some estimators can transform data and that some estimators can predict variables. We can also create combined estimators:

from sklearn import linear_model, decomposition, datasets
from sklearn.pipeline import Pipeline
from sklearn.model_selection import GridSearchCV

logistic = linear_model.LogisticRegression()
pca = decomposition.PCA()
pipe = Pipeline(steps=[('pca', pca), ('logistic', logistic)])

digits = datasets.load_digits()
X_digits = digits.data
y_digits = digits.target
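Continuing the snippet above, a minimal sketch of how the combined estimator can be tuned with GridSearchCV; the parameter values here are illustrative assumptions, not the grid used in the tutorial.

# Pipeline parameters are addressed as '<step name>__<parameter>'
param_grid = {
    'pca__n_components': [20, 40, 64],
    'logistic__C': [0.1, 1.0, 10.0],
}
estimator = GridSearchCV(pipe, param_grid)
estimator.fit(X_digits, y_digits)
print(estimator.best_params_)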

Probability calibration of classifiers

When performing classification you often want to predict not only the class label, but also the associated probability. This probability gives you some kind of confidence in the prediction. However, not all classifiers provide well-calibrated probabilities: some are over-confident while others are under-confident. Thus, a separate calibration of predicted probabilities is often desirable as a postprocessing step. This example illustrates two different methods for this calibration and evaluates the quality of the returned probabilities using the Brier score.
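A minimal sketch of one of the calibration methods (sigmoid/Platt scaling) applied to a Gaussian naive Bayes classifier on an assumed synthetic dataset, comparing Brier scores before and after calibration; the full example also covers isotonic calibration.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.calibration import CalibratedClassifierCV
from sklearn.metrics import brier_score_loss

# Assumed binary classification problem
X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Uncalibrated classifier vs. its sigmoid-calibrated counterpart
clf = GaussianNB().fit(X_train, y_train)
calibrated = CalibratedClassifierCV(GaussianNB(), method='sigmoid', cv=3).fit(X_train, y_train)

for name, model in [('uncalibrated', clf), ('sigmoid-calibrated', calibrated)]:
    prob_pos = model.predict_proba(X_test)[:, 1]
    print(name, "Brier score: %0.3f" % brier_score_loss(y_test, prob_pos))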