cross_validation.LeaveOneLabelOut()

class sklearn.cross_validation.LeaveOneLabelOut(labels) [source]

Warning: DEPRECATED since version 0.18; this module will be removed in 0.20. Use sklearn.model_selection.LeaveOneGroupOut instead.

Leave-One-Label-Out cross-validation iterator. Provides train/test indices to split data according to a third-party provided label. This label information can be used to encode arbitrary domain-specific stratifications of the samples as integers. For instance, the labels could be the year of collection of the samples, allowing cross-validation against time-based splits.
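A minimal sketch of the recommended replacement, assuming scikit-learn 0.18+ with sklearn.model_selection available; the data and group values are illustrative:

    import numpy as np
    from sklearn.model_selection import LeaveOneGroupOut

    X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
    y = np.array([0, 1, 0, 1])
    groups = np.array([2014, 2014, 2015, 2015])  # e.g. year of collection

    logo = LeaveOneGroupOut()
    for train_idx, test_idx in logo.split(X, y, groups):
        # each iteration holds out every sample from one group
        print("TRAIN:", train_idx, "TEST:", test_idx)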

Comparing various online solvers

An example showing how different online solvers perform on the hand-written digits dataset.

Out:
    training SGD
    training ASGD
    training Perceptron
    training Passive-Aggressive I
    training Passive-Aggressive II
    training SAG

    # Author: Rob Zinkov <rob at zinkov dot com>
    # License: BSD 3 clause

    import numpy as np
    import matplotlib.pyplot as plt

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import SGDClassifier, Perceptron
    …
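The full example grows the training set and plots test error per solver; a condensed sketch of the core comparison, with illustrative solver settings, might look like this:

    from sklearn import datasets
    from sklearn.model_selection import train_test_split
    from sklearn.linear_model import (SGDClassifier, Perceptron,
                                      PassiveAggressiveClassifier)

    digits = datasets.load_digits()
    X_train, X_test, y_train, y_test = train_test_split(
        digits.data, digits.target, test_size=0.25, random_state=42)

    classifiers = [
        ("SGD", SGDClassifier()),
        ("Perceptron", Perceptron()),
        ("Passive-Aggressive I", PassiveAggressiveClassifier(loss="hinge", C=1.0)),
    ]
    for name, clf in classifiers:
        print("training %s" % name)
        clf.fit(X_train, y_train)
        print("  test error: %.3f" % (1 - clf.score(X_test, y_test)))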

Using FunctionTransformer to select columns

Shows how to use a function transformer in a pipeline. If you know your dataset's first principal component is irrelevant for a classification task, you can use the FunctionTransformer to select all but the first column of the PCA-transformed data.

    import matplotlib.pyplot as plt
    import numpy as np

    from sklearn.model_selection import train_test_split
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer
    …
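A minimal sketch of that idea, assuming PCA followed by a FunctionTransformer; all_but_first is a hypothetical helper name, not part of the original example:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.pipeline import make_pipeline
    from sklearn.preprocessing import FunctionTransformer

    def all_but_first(X):
        # drop the first column, i.e. the first principal component
        return X[:, 1:]

    X, y = load_iris(return_X_y=True)
    pipeline = make_pipeline(PCA(), FunctionTransformer(all_but_first))
    X_reduced = pipeline.fit_transform(X)
    print(X_reduced.shape)  # one column fewer than PCA's output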

Feature importances with forests of trees

This example shows the use of forests of trees to evaluate the importance of features on an artificial classification task. The red bars are the feature importances of the forest, along with their inter-trees variability. As expected, the plot suggests that 3 features are informative, while the remaining ones are not.

Out:
    Feature ranking:
    1. feature 1 (0.295902)
    2. feature 2 (0.208351)
    3. feature 0 (0.177632)
    4. feature 3 (0.047121)
    5. feature 6 (0.046303)
    6. feature 8 (0.046013)
    7. feature …
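A hedged sketch of the technique, assuming an ExtraTreesClassifier on a synthetic task with 3 informative features (the exact estimator and settings in the example may differ):

    import numpy as np
    from sklearn.datasets import make_classification
    from sklearn.ensemble import ExtraTreesClassifier

    X, y = make_classification(n_samples=1000, n_features=10,
                               n_informative=3, n_redundant=0,
                               shuffle=False, random_state=0)
    forest = ExtraTreesClassifier(n_estimators=250, random_state=0)
    forest.fit(X, y)

    importances = forest.feature_importances_
    # per-tree spread of the importances gives the inter-trees variability
    std = np.std([tree.feature_importances_ for tree in forest.estimators_],
                 axis=0)
    for rank, idx in enumerate(np.argsort(importances)[::-1], start=1):
        print("%d. feature %d (%f)" % (rank, idx, importances[idx]))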

Plot the decision surfaces of ensembles of trees on the iris dataset

Plot the decision surfaces of forests of randomized trees trained on pairs of features of the iris dataset. This plot compares the decision surfaces learned by a decision tree classifier (first column), by a random forest classifier (second column), by an extra-trees classifier (third column), and by an AdaBoost classifier (fourth column). In the first row, the classifiers are built using the sepal width and the sepal length features only; in the second row, using the petal length and sepal len…
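A minimal sketch of one cell of that grid, assuming a random forest on the sepal width/sepal length pair; the other classifiers and feature pairs follow the same mesh-and-contour pattern:

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_iris
    from sklearn.ensemble import RandomForestClassifier

    iris = load_iris()
    X, y = iris.data[:, [1, 0]], iris.target  # sepal width, sepal length

    clf = RandomForestClassifier(n_estimators=30, random_state=0).fit(X, y)

    # evaluate the classifier on a dense grid to draw the decision surface
    xx, yy = np.meshgrid(
        np.arange(X[:, 0].min() - 1, X[:, 0].max() + 1, 0.02),
        np.arange(X[:, 1].min() - 1, X[:, 1].max() + 1, 0.02))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    plt.contourf(xx, yy, Z, alpha=0.4)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
    plt.xlabel("sepal width (cm)")
    plt.ylabel("sepal length (cm)")
    plt.show()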

base.BaseEstimator

class sklearn.base.BaseEstimator [source]

Base class for all estimators in scikit-learn.

Notes: All estimators should specify all the parameters that can be set at the class level in their __init__ as explicit keyword arguments (no *args or **kwargs).

Methods:
    get_params([deep])    Get parameters for this estimator.
    set_params(**params)  Set the parameters of this estimator.

__init__()
    x.__init__(...) initializes x; see help(type(x)) for signature.

get_params(deep=True) [source]
    Get parameters for this estimator.
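A hedged sketch of a custom estimator honoring that contract; the class name and its parameter are illustrative:

    from sklearn.base import BaseEstimator

    class ShiftEstimator(BaseEstimator):
        # every constructor parameter is an explicit keyword argument and is
        # stored unmodified, so get_params/set_params work without overrides
        def __init__(self, shift=0.0):
            self.shift = shift

        def fit(self, X, y=None):
            return self

        def predict(self, X):
            return X + self.shift

    est = ShiftEstimator(shift=1.5)
    print(est.get_params())   # {'shift': 1.5}
    est.set_params(shift=2.0)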

sklearn.svm.libsvm.predict_proba()

sklearn.svm.libsvm.predict_proba()

Predict probabilities. svm_model stores all parameters needed to predict a given value. For speed, all the real work is done at the C level in the function copy_predict (libsvm_helper.c). We have to reconstruct the model and parameters to make sure we stay in sync with the Python object. See sklearn.svm.predict for a complete list of parameters.

Parameters:
    X : array-like, dtype=float
    kernel : {'linear', 'rbf', 'poly', 'sigmoid', 'precomputed'}

Returns:
    dec_values…
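This low-level routine is normally reached through the estimator API rather than called directly; a minimal sketch of the supported route, using SVC with probability estimates enabled:

    from sklearn.datasets import load_iris
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    clf = SVC(kernel="rbf", probability=True, random_state=0).fit(X, y)
    print(clf.predict_proba(X[:3]))  # one row per sample, one column per class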

cross_validation.LabelKFold()

class sklearn.cross_validation.LabelKFold(labels, n_folds=3) [source]

Warning: DEPRECATED since version 0.18; this module will be removed in 0.20. Use sklearn.model_selection.GroupKFold instead.

K-fold iterator variant with non-overlapping labels. The same label will not appear in two different folds (the number of distinct labels has to be at least equal to the number of folds). The folds are approximately balanced in the sense that the number of distinct labels is approximately the same in each fold.
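A minimal sketch of the replacement, GroupKFold, assuming at least as many distinct groups as folds; the data and group values are illustrative:

    import numpy as np
    from sklearn.model_selection import GroupKFold

    X = np.arange(12).reshape(6, 2)
    y = np.array([0, 1, 0, 1, 0, 1])
    groups = np.array([1, 1, 2, 2, 3, 3])

    gkf = GroupKFold(n_splits=3)
    for train_idx, test_idx in gkf.split(X, y, groups):
        # no group appears on both sides of a split
        print("TRAIN:", train_idx, "TEST:", test_idx)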

grid_search.ParameterSampler()

class sklearn.grid_search.ParameterSampler(param_distributions, n_iter, random_state=None) [source]

Warning: DEPRECATED since version 0.18; this module will be removed in 0.20. Use sklearn.model_selection.ParameterSampler instead.

Generator on parameters sampled from given distributions: a non-deterministic iterable over random candidate combinations for hyperparameter search. If all parameters are presented as a list, sampling without replacement is performed. If at least one parameter is given as a distribution, sampling with replacement is used.
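A minimal sketch of the replacement in sklearn.model_selection, mixing a list (sampled uniformly) with a scipy.stats distribution (which triggers sampling with replacement); the parameter grid is illustrative:

    from scipy.stats import expon
    from sklearn.model_selection import ParameterSampler

    param_distributions = {
        "C": expon(scale=1.0),        # distribution: sampled via rvs()
        "kernel": ["linear", "rbf"],  # list: sampled uniformly
    }
    for params in ParameterSampler(param_distributions, n_iter=4,
                                   random_state=0):
        print(params)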

sklearn.metrics.pairwise_distances_argmin_min()

sklearn.metrics.pairwise_distances_argmin_min(X, Y, axis=1, metric='euclidean', batch_size=500, metric_kwargs=None) [source]

Compute minimum distances between one point and a set of points. For each row in X, this function computes the index of the row of Y which is closest (according to the specified distance); the minimal distances are also returned. This is mostly equivalent to calling:

    (pairwise_distances(X, Y=Y, metric=metric).argmin(axis=axis),
     pairwise_distances(X, Y=Y, metric=metric).min(axis=axis))
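A minimal usage sketch; for each row of X it returns the index of the closest row of Y and the corresponding distance:

    import numpy as np
    from sklearn.metrics import pairwise_distances_argmin_min

    X = np.array([[0.0, 0.0], [2.0, 2.0]])
    Y = np.array([[0.0, 1.0], [1.0, 0.0], [3.0, 3.0]])

    argmin, distances = pairwise_distances_argmin_min(X, Y)
    print(argmin)     # index into Y of the nearest point, per row of X
    print(distances)  # the matching minimum distances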