cross_validation.LeaveOneLabelOut()

class sklearn.cross_validation.LeaveOneLabelOut(labels) [source]

Warning: DEPRECATED since version 0.18; this module will be removed in 0.20. Use sklearn.model_selection.LeaveOneGroupOut instead.

Leave-One-Label-Out cross-validation iterator. Provides train/test indices to split data according to a third-party provided label. This label information can be used to encode arbitrary domain-specific stratifications of the samples as integers. For instance, the labels could be the year of collection of the samples, allowing cross-validation against time-based splits.
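A minimal sketch of the recommended replacement, assuming scikit-learn 0.18+ with sklearn.model_selection available; the data and group values are illustrative:

    import numpy as np
    from sklearn.model_selection import LeaveOneGroupOut

    X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
    y = np.array([0, 1, 0, 1])
    groups = np.array([2014, 2014, 2015, 2015])  # e.g. year of collection

    logo = LeaveOneGroupOut()
    for train_idx, test_idx in logo.split(X, y, groups):
        # each iteration holds out every sample from one group
        print("TRAIN:", train_idx, "TEST:", test_idx)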

Comparing various online solvers

An example showing how different online solvers perform on the hand-written digits dataset.

Out:
    training SGD
    training ASGD
    training Perceptron
    training Passive-Aggressive I
    training Passive-Aggressive II
    training SAG

    # Author: Rob Zinkov <rob at zinkov dot com>
    # License: BSD 3 clause

    import numpy as np
    import matplotlib.pyplot as plt

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import SGDClassifier, Perceptron
    …
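The full example grows the training set and plots test error per solver; a condensed sketch of the core comparison, with illustrative solver settings, might look like this:

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import (SGDClassifier, Perceptron,
                                      PassiveAggressiveClassifier)

    digits = datasets.load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=42)

    classifiers = [
        ("SGD", SGDClassifier()),
        ("Perceptron", Perceptron()),
        ("Passive-Aggressive I", PassiveAggressiveClassifier(loss="hinge", C=1.0)),
    ]
    for name, clf in classifiers:
        print("training %s" % name)
        clf.fit(X_train, y_train)
        print("  test error: %.3f" % (1 - clf.score(X_test, y_test)))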

Using FunctionTransformer to select columns

Shows how to use a function transformer in a pipeline. If you know your dataset's first principal component is irrelevant for a classification task, you can use the FunctionTransformer to select all but the first column of the PCA-transformed data.

    import matplotlib.pyplot as plt
    import numpy as np

    from sklearn.model_selection import train_test_split
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer
    …
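A minimal sketch of that idea, assuming PCA followed by a FunctionTransformer; all_but_first is a hypothetical helper name, not part of the original example:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer

    def all_but_first(X):
        # drop the first column, i.e. the first principal component
        return X[:, 1:]

    X, y = load_iris(return_X_y=True)
    pipeline = make_pipeline(PCA(), FunctionTransformer(all_but_first))
    X_reduced = pipeline.fit_transform(X)
    print(X_reduced.shape)  # one column fewer than PCA's output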

Feature importances with forests of trees

This example shows the use of forests of trees to evaluate the importance of features on an artificial classification task. The red bars are the feature importances of the forest, along with their inter-trees variability. As expected, the plot suggests that 3 features are informative, while the remaining ones are not.

Out:
    Feature ranking:
    1. feature 1 (0.295902)
    2. feature 2 (0.208351)
    3. feature 0 (0.177632)
    4. feature 3 (0.047121)
    5. feature 6 (0.046303)
    6. feature 8 (0.046013)
    7. feature …
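A hedged sketch of the technique, assuming an ExtraTreesClassifier on a synthetic task with 3 informative features (the exact estimator and settings in the example may differ):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import ExtraTreesClassifier

    X, y = make_classification(n_samples=1000, n_features=10,
                               n_informative=3, n_redundant=0,
                               shuffle=False, random_state=0)
    forest = ExtraTreesClassifier(n_estimators=250, random_state=0)
    forest.fit(X, y)

    importances = forest.feature_importances_
    # per-tree spread of the importances gives the inter-trees variability
    std = np.std([tree.feature_importances_ for tree in forest.estimators_],
                 axis=0)
    for rank, idx in enumerate(np.argsort(importances)[::-1], start=1):
        print("%d. feature %d (%f)" % (rank, idx, importances[idx]))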

Plot the decision surfaces of ensembles of trees on the iris dataset

Plot the decision surfaces of forests of randomized trees trained on pairs of features of the iris dataset. This plot compares the decision surfaces learned by a decision tree classifier (first column), by a random forest classifier (second column), by an extra-trees classifier (third column), and by an AdaBoost classifier (fourth column). In the first row, the classifiers are built using the sepal width and the sepal length features only; in the second row, using the petal length and sepal len…
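A minimal sketch of one cell of that grid, assuming a random forest on the sepal width/sepal length pair; the other classifiers and feature pairs follow the same mesh-and-contour pattern:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    iris = load_iris()
    X, y = iris.data[:, [1, 0]], iris.target  # sepal width, sepal length

    clf = RandomForestClassifier(n_estimators=30, random_state=0).fit(X, y)

    # evaluate the classifier on a dense grid to draw the decision surface
    xx, yy = np.meshgrid(
        np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.02),
        np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.02))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    plt.contourf(xx, yy, Z, alpha=0.4)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
    plt.xlabel("sepal width (cm)")
    plt.ylabel("sepal length (cm)")
    plt.show()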

base.BaseEstimator

class sklearn.base.BaseEstimator [source]

Base class for all estimators in scikit-learn.

Notes: All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods:
    get_params([deep])    Get parameters for this estimator.
    set_params(**params)  Set the parameters of this estimator.

__init__()
    x.__init__(...) initializes x; see help(type(x)) for signature.

get_params(deep=True) [source]
    Get parameters for this estimator.
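A hedged sketch of a custom estimator honoring that contract; the class name and its parameter are illustrative:

    from sklearn.base import BaseEstimator

    class ShiftEstimator(BaseEstimator):
        # every constructor parameter is an explicit keyword argument and is
        # stored unmodified, so get_params/set_params work without overrides
        def __init__(self, shift=0.0):
            self.shift = shift

        def fit(self, X, y=None):
            return self

        def predict(self, X):
            return X + self.shift

    est = ShiftEstimator(shift=1.5)
    print(est.get_params())   # {'shift': 1.5}
    est.set_params(shift=2.0)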

sklearn.svm.libsvm.predict_proba()

sklearn.svm.libsvm.predict_proba()

Predict probabilities. svm_model stores all parameters needed to predict a given value. For speed, all the real work is done at the C level in the function copy_predict (libsvm_helper.c). We have to reconstruct the model and parameters to make sure we stay in sync with the Python object. See sklearn.svm.predict for a complete list of parameters.

Parameters:
    X : array-like, dtype=float
    kernel : {'linear', 'rbf', 'poly', 'sigmoid', 'precomputed'}

Returns:
    dec_values…
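This low-level routine is normally reached through the estimator API rather than called directly; a minimal sketch of the supported route, using SVC with probability estimates enabled:

    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)
    print(clf.predict_proba(X[:3]))  # one row per sample, one column per class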

cross_validation.LabelKFold()

class sklearn.cross_validation.LabelKFold(labels, n_folds=3) [source]

Warning: DEPRECATED since version 0.18; this module will be removed in 0.20. Use sklearn.model_selection.GroupKFold instead.

K-fold iterator variant with non-overlapping labels. The same label will not appear in two different folds (the number of distinct labels has to be at least equal to the number of folds). The folds are approximately balanced in the sense that the number of distinct labels is approximately the same in each fold.
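A minimal sketch of the replacement, GroupKFold, assuming at least as many distinct groups as folds; the data and group values are illustrative:

    import numpy as np
    from sklearn.model_selection import GroupKFold

    X = np.arange(12).reshape(6, 2)
    y = np.array([0, 1, 0, 1, 0, 1])
    groups = np.array([1, 1, 2, 2, 3, 3])

    gkf = GroupKFold(n_splits=3)
    for train_idx, test_idx in gkf.split(X, y, groups):
        # no group appears on both sides of a split
        print("TRAIN:", train_idx, "TEST:", test_idx)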

grid_search.ParameterSampler()

class sklearn.grid_search.ParameterSampler(param_distributions, n_iter, random_state=None) [source]

Warning: DEPRECATED since version 0.18; this module will be removed in 0.20. Use sklearn.model_selection.ParameterSampler instead.

Generator on parameters sampled from given distributions: a non-deterministic iterable over random candidate combinations for hyperparameter search. If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used.
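A minimal sketch of the replacement in sklearn.model_selection, mixing a list (sampled uniformly) with a scipy.stats distribution (which triggers sampling with replacement); the parameter grid is illustrative:

    from scipy.stats import expon
    from sklearn.model_selection import ParameterSampler

    param_distributions = {
        "C": expon(scale=1.0),        # distribution: sampled via rvs()
        "kernel": ["linear", "rbf"],  # list: sampled uniformly
    }
    for params in ParameterSampler(param_distributions, n_iter=4,
                                   random_state=0):
        print(params)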

sklearn.metrics.pairwise_distances_argmin_min()

sklearn.metrics.pairwise_distances_argmin_min(X, Y, axis=1, metric='euclidean', batch_size=500, metric_kwargs=None) [source]

Compute minimum distances between one point and a set of points. For each row in X, this function computes the index of the row of Y which is closest (according to the specified distance); the minimal distances are also returned. This is mostly equivalent to calling:

    (pairwise_distances(X, Y=Y, metric=metric).argmin(axis=axis),
     pairwise_distances(X, Y=Y, metric=metric).min(axis=axis))
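A minimal usage sketch; for each row of X it returns the index of the closest row of Y and the corresponding distance:

    import numpy as np
    from sklearn.metrics import pairwise_distances_argmin_min

    X = np.array([[0.0, 0.0], [2.0, 2.0]])
    Y = np.array([[0.0, 1.0], [1.0, 0.0], [3.0, 3.0]])

    argmin, distances = pairwise_distances_argmin_min(X, Y)
    print(argmin)     # index into Y of the nearest point, per row of X
    print(distances)  # the matching minimum distances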