ensemble.RandomForestRegressor()

class sklearn.ensemble.RandomForestRegressor(n_estimators=10, criterion='mse', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features='auto', max_leaf_nodes=None, min_impurity_split=1e-07, bootstrap=True, oob_score=False, n_jobs=1, random_state=None, verbose=0, warm_start=False) [source] A random forest regressor. A random forest is a meta estimator that fits a number of decision tree regressors on various sub-samples of the dataset and uses averaging to improve the predictive accuracy and control over-fitting.
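
A minimal usage sketch (the synthetic dataset and parameter values are illustrative, not part of the API above):

from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor

# toy regression problem; in practice X, y come from your own data
X, y = make_regression(n_samples=200, n_features=4, random_state=0)

reg = RandomForestRegressor(n_estimators=100, random_state=0)
reg.fit(X, y)
print(reg.predict(X[:2]))        # predictions averaged over the trees
print(reg.feature_importances_)  # impurity-based importance per feature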

sklearn.metrics.silhouette_score()

sklearn.metrics.silhouette_score(X, labels, metric='euclidean', sample_size=None, random_state=None, **kwds) [source] Compute the mean Silhouette Coefficient of all samples. The Silhouette Coefficient is calculated using the mean intra-cluster distance (a) and the mean nearest-cluster distance (b) for each sample. The Silhouette Coefficient for a sample is (b - a) / max(a, b). To clarify, b is the distance between a sample and the nearest cluster that the sample is not a part of. Note that the Silhouette Coefficient is only defined if the number of labels satisfies 2 <= n_labels <= n_samples - 1.
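
A short sketch of how the score is typically used to evaluate a clustering (KMeans and the blob dataset are illustrative choices):

from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)
labels = KMeans(n_clusters=3, random_state=0).fit_predict(X)

# mean coefficient over all samples; near +1 means dense, well-separated clusters
print(silhouette_score(X, labels, metric='euclidean'))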

Nested versus non-nested cross-validation

This example compares non-nested and nested cross-validation strategies on a classifier of the iris data set. Nested cross-validation (CV) is often used to train a model in which hyperparameters also need to be optimized. Nested CV estimates the generalization error of the underlying model and its (hyper)parameter search. Choosing the parameters that maximize non-nested CV biases the model to the dataset, yielding an overly optimistic score. Model selection without nested CV uses the same data to tune model parameters and evaluate model performance, so information may "leak" into the model and overfit the data.
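
A condensed sketch of the comparison (the SVC classifier and parameter grid are illustrative; the full example repeats this over many random trials):

from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV, KFold, cross_val_score
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
param_grid = {'C': [1, 10, 100], 'gamma': [0.01, 0.1]}
inner_cv = KFold(n_splits=4, shuffle=True, random_state=0)
outer_cv = KFold(n_splits=4, shuffle=True, random_state=0)

clf = GridSearchCV(SVC(), param_grid=param_grid, cv=inner_cv)

# non-nested: the same data tunes the hyperparameters and reports the score
non_nested_score = clf.fit(X, y).best_score_

# nested: an outer loop scores the whole search procedure on held-out folds
nested_score = cross_val_score(clf, X, y, cv=outer_cv).mean()

print(non_nested_score, nested_score)  # non-nested is typically optimistic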

pipeline.FeatureUnion()

class sklearn.pipeline.FeatureUnion(transformer_list, n_jobs=1, transformer_weights=None) [source] Concatenates results of multiple transformer objects. This estimator applies a list of transformer objects in parallel to the input data, then concatenates the results. This is useful to combine several feature extraction mechanisms into a single transformer. Parameters of the transformers may be set using the transformer's name and the parameter name separated by '__'. A transformer may be replaced entirely by setting the parameter with its name to another transformer, or removed by setting it to None.
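
A minimal sketch combining two feature extractors (PCA and SelectKBest are illustrative choices):

from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest
from sklearn.pipeline import FeatureUnion

X, y = load_iris(return_X_y=True)
union = FeatureUnion([('pca', PCA(n_components=2)),
                      ('kbest', SelectKBest(k=1))])
X_combined = union.fit_transform(X, y)
print(X_combined.shape)  # (150, 3): 2 PCA components + 1 selected feature

# nested parameters use the <name>__<parameter> convention
union.set_params(pca__n_components=1)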

preprocessing.MinMaxScaler()

class sklearn.preprocessing.MinMaxScaler(feature_range=(0, 1), copy=True) [source] Transforms features by scaling each feature to a given range. This estimator scales and translates each feature individually such that it is in the given range on the training set, i.e. between zero and one. The transformation is given by:

X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min

where min, max = feature_range. This transformation is often used as an alternative to zero mean, unit variance scaling.
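
A minimal sketch (the tiny array is illustrative):

import numpy as np
from sklearn.preprocessing import MinMaxScaler

X_train = np.array([[1., -1.], [2., 0.], [0., 10.]])

scaler = MinMaxScaler(feature_range=(0, 1))
scaler.fit(X_train)                   # learns per-feature min and max
print(scaler.transform(X_train))      # each column now spans [0, 1]
print(scaler.transform([[1.5, 5.]]))  # new data is scaled with the training statistics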

isotonic.IsotonicRegression()

class sklearn.isotonic.IsotonicRegression(y_min=None, y_max=None, increasing=True, out_of_bounds='nan') [source] Isotonic regression model. The isotonic regression optimization problem is defined by:

min sum w[i] * (y[i] - y_[i])**2
subject to y_[i] <= y_[j] whenever X[i] <= X[j]
and min(y_) = y_min, max(y_) = y_max

where:
y[i] are inputs (real numbers)
y_[i] are fitted
X specifies the order. If X is non-decreasing then y_ is non-decreasing.
w[i] are optional strictly positive weights (default to 1.0)
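
A minimal sketch fitting a non-decreasing curve to noisy data (the synthetic data is illustrative):

import numpy as np
from sklearn.isotonic import IsotonicRegression

rng = np.random.RandomState(0)
x = np.arange(10, dtype=float)
y = x + rng.normal(size=10)     # roughly increasing, with noise

ir = IsotonicRegression(increasing=True)
y_fit = ir.fit_transform(x, y)  # closest non-decreasing fit in least squares
print(y_fit)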

lda.LDA()

Warning: DEPRECATED. class sklearn.lda.LDA(solver='svd', shrinkage=None, priors=None, n_components=None, store_covariance=False, tol=0.0001) [source] Alias for sklearn.discriminant_analysis.LinearDiscriminantAnalysis. Deprecated since version 0.17: This class will be removed in 0.19. Use sklearn.discriminant_analysis.LinearDiscriminantAnalysis instead. Methods: decision_function(X) predicts confidence scores for samples; fit(X, y[, store_covariance, tol]) fits the LinearDiscriminantAnalysis model according to the given training data and parameters.
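
Given the deprecation, new code should import the replacement class directly; a minimal sketch (the iris data is illustrative):

from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)
clf = LinearDiscriminantAnalysis(solver='svd')
clf.fit(X, y)
print(clf.predict(X[:3]))                  # class labels
print(clf.decision_function(X[:1]).shape)  # one confidence score per class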

naive_bayes.GaussianNB()

class sklearn.naive_bayes.GaussianNB(priors=None) [source] Gaussian Naive Bayes (GaussianNB). Can perform online updates to model parameters via the partial_fit method. For details on the algorithm used to update feature means and variances online, see the Stanford CS tech report STAN-CS-79-773 by Chan, Golub, and LeVeque: http://i.stanford.edu/pub/cstr/reports/cs/tr/79/773/CS-TR-79-773.pdf Read more in the User Guide. Parameters: priors : array-like, shape (n_classes,) Prior probabilities of the classes. If specified, the priors are not adjusted according to the data.
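
A minimal sketch, including the online partial_fit path (the toy data is illustrative):

import numpy as np
from sklearn.naive_bayes import GaussianNB

X = np.array([[-1., -1.], [-2., -1.], [1., 1.], [2., 1.]])
y = np.array([0, 0, 1, 1])

clf = GaussianNB()
clf.partial_fit(X[:2], y[:2], classes=np.unique(y))  # first call must list all classes
clf.partial_fit(X[2:], y[2:])                        # later batches update the running means/variances
print(clf.predict([[-0.8, -1.]]))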

random_projection.SparseRandomProjection()

class sklearn.random_projection.SparseRandomProjection(n_components='auto', density='auto', eps=0.1, dense_output=False, random_state=None) [source] Reduce dimensionality through sparse random projection. A sparse random matrix is an alternative to a dense random projection matrix: it guarantees similar embedding quality while being much more memory efficient and allowing faster computation of the projected data. If we note s = 1 / density, the components of the random matrix are drawn from:

-sqrt(s) / sqrt(n_components)   with probability 1 / 2s
 0                              with probability 1 - 1 / s
+sqrt(s) / sqrt(n_components)   with probability 1 / 2s
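
A minimal sketch (the random data and its dimensions are illustrative):

import numpy as np
from sklearn.random_projection import SparseRandomProjection

rng = np.random.RandomState(0)
X = rng.rand(100, 10000)              # high-dimensional input

transformer = SparseRandomProjection(random_state=0)
X_new = transformer.fit_transform(X)  # n_components='auto' uses the Johnson-Lindenstrauss bound
print(X_new.shape)
print(transformer.density_)           # concrete density computed when density='auto'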

dummy.DummyClassifier()

class sklearn.dummy.DummyClassifier(strategy='stratified', random_state=None, constant=None) [source] DummyClassifier is a classifier that makes predictions using simple rules. This classifier is useful as a simple baseline to compare with other (real) classifiers. Do not use it for real problems. Read more in the User Guide. Parameters: strategy : str, default='stratified' Strategy to use to generate predictions. 'stratified': generates predictions by respecting the training set's class distribution.
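
A minimal sketch of a baseline score to compare real classifiers against (the choice of strategy and dataset is illustrative):

from sklearn.datasets import load_iris
from sklearn.dummy import DummyClassifier
from sklearn.model_selection import cross_val_score

X, y = load_iris(return_X_y=True)

baseline = DummyClassifier(strategy='most_frequent')  # always predicts the majority class
print(cross_val_score(baseline, X, y, cv=5).mean())   # chance-level reference score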