preprocessing.FunctionTransformer()

class sklearn.preprocessing.FunctionTransformer(func=None, inverse_func=None, validate=True, accept_sparse=False, pass_y=False, kw_args=None, inv_kw_args=None)

Constructs a transformer from an arbitrary callable. A FunctionTransformer forwards its X (and optionally y) arguments to a user-defined function or function object and returns the result of this function. This is useful for stateless transformations such as taking the log of frequencies, doing custom scaling, etc. A FunctionTransformer will not do any checks on its function's output.
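A minimal sketch of typical usage, assuming a plain numeric array (the data and the choice of np.log1p are illustrative, not part of the reference):

import numpy as np
from sklearn.preprocessing import FunctionTransformer

# Wrap a stateless NumPy function as a transformer usable in a Pipeline.
transformer = FunctionTransformer(np.log1p)
X = np.array([[0.0, 1.0], [2.0, 3.0]])
print(transformer.transform(X))  # element-wise log(1 + x)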

FastICA on 2D point clouds

This example illustrates visually, in the feature space, a comparison of the results obtained with two different component analysis techniques: Independent Component Analysis (ICA) and Principal Component Analysis (PCA). Representing ICA in the feature space gives the view of "geometric ICA": ICA is an algorithm that finds directions in the feature space corresponding to projections with high non-Gaussianity. These directions need not be orthogonal in the original feature space, but they are orthogonal in the whitened feature space, in which all directions correspond to the same variance.
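A hedged sketch of the idea (not the full gallery script; the heavy-tailed source distribution and mixing matrix below are illustrative assumptions):

import numpy as np
from sklearn.decomposition import PCA, FastICA

rng = np.random.RandomState(42)
S = rng.standard_t(1.5, size=(2000, 2))   # non-Gaussian, heavy-tailed sources
A = np.array([[1.0, 1.0], [0.0, 2.0]])    # mixing matrix
X = S.dot(A.T)                            # observed mixed signals

pca = PCA(n_components=2).fit(X)                        # directions of maximal variance
ica = FastICA(n_components=2, random_state=0).fit(X)    # directions of maximal non-Gaussianity
print("PCA components:\n", pca.components_)
print("ICA estimated mixing:\n", ica.mixing_)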

Multi-dimensional scaling

An illustration of metric and non-metric MDS on generated noisy data. The reconstructed points using the metric MDS and non-metric MDS are slightly shifted to avoid overlapping. The example script begins with the following imports:

# Author: Nelle Varoquaux <nelle.varoquaux@gmail.com>
# License: BSD

print(__doc__)

import numpy as np

from matplotlib import pyplot as plt
from matplotlib.collections import LineCollection

from sklearn import manifold
from sklearn.metrics import euclidean_distances
from sklearn.decomposition import PCA
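Since the listing above is cut off at the imports, here is a hedged sketch of the core fitting step, assuming a small synthetic point set and a precomputed Euclidean dissimilarity matrix (both are illustrative):

import numpy as np
from sklearn import manifold
from sklearn.metrics import euclidean_distances

rng = np.random.RandomState(0)
X_true = rng.randint(0, 20, size=(20, 2)).astype(float)    # illustrative 2D points
similarities = euclidean_distances(X_true)                  # precomputed dissimilarities

# Metric MDS: preserves the dissimilarity values themselves.
mds = manifold.MDS(n_components=2, dissimilarity="precomputed", random_state=0)
pos = mds.fit_transform(similarities)

# Non-metric MDS: preserves only the rank order of the dissimilarities.
nmds = manifold.MDS(n_components=2, metric=False, dissimilarity="precomputed",
                    random_state=0, n_init=1)
npos = nmds.fit_transform(similarities)
print(pos.shape, npos.shape)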

cluster.MiniBatchKMeans()

class sklearn.cluster.MiniBatchKMeans(n_clusters=8, init='k-means++', max_iter=100, batch_size=100, verbose=0, compute_labels=True, random_state=None, tol=0.0, max_no_improvement=10, init_size=None, n_init=3, reassignment_ratio=0.01)

Mini-Batch K-Means clustering. Read more in the User Guide.

Parameters:
    n_clusters : int, optional, default: 8
        The number of clusters to form as well as the number of centroids to generate.
    max_iter : int, optional
        Maximum number of iterations over the complete dataset before stopping independently of any early stopping criterion heuristics.
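A short sketch of typical usage, assuming synthetic blob data (make_blobs and the parameter values are illustrative):

from sklearn.cluster import MiniBatchKMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=3000, centers=8, random_state=0)

# Fit on mini-batches of 100 samples instead of the full dataset per update.
mbk = MiniBatchKMeans(n_clusters=8, batch_size=100, random_state=0)
labels = mbk.fit_predict(X)
print(mbk.cluster_centers_.shape)   # (8, n_features)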

linear_model.ARDRegression()

class sklearn.linear_model.ARDRegression(n_iter=300, tol=0.001, alpha_1=1e-06, alpha_2=1e-06, lambda_1=1e-06, lambda_2=1e-06, compute_score=False, threshold_lambda=10000.0, fit_intercept=True, normalize=False, copy_X=True, verbose=False)

Bayesian ARD regression. Fit the weights of a regression model, using an ARD prior. The weights of the regression model are assumed to follow Gaussian distributions. Also estimate the parameters lambda (precisions of the distributions of the weights) and alpha (precision of the distribution of the noise).
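A minimal sketch, assuming a synthetic sparse linear problem (the data and coefficients are made up for illustration):

import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 10)
true_w = np.zeros(10)
true_w[:3] = [1.5, -2.0, 3.0]              # only 3 informative features
y = X.dot(true_w) + 0.1 * rng.randn(100)

ard = ARDRegression(compute_score=True)
ard.fit(X, y)
print(np.round(ard.coef_, 2))              # the ARD prior shrinks irrelevant weights toward 0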

Gaussian Mixture Model Sine Curve

This example demonstrates the behavior of Gaussian mixture models fit on data that was not sampled from a mixture of Gaussian random variables. The dataset is formed by 100 points loosely spaced following a noisy sine curve. There is therefore no ground-truth value for the number of Gaussian components. The first model is a classical Gaussian mixture model with 10 components fit with the expectation-maximization algorithm. The second model is a Bayesian Gaussian mixture model with a Dirichlet process prior fit with variational inference.
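A hedged sketch of the comparison (not the full gallery script; the sine-curve data generation below is an assumption made for illustration):

import numpy as np
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

rng = np.random.RandomState(0)
t = 4 * np.pi * rng.rand(100)
X = np.column_stack([t, np.sin(t) + 0.1 * rng.randn(100)])   # noisy sine curve

# Classical mixture fit with expectation-maximization.
gmm = GaussianMixture(n_components=10, covariance_type="full", random_state=0).fit(X)

# Bayesian mixture with a Dirichlet process prior fit with variational inference;
# superfluous components end up with weights close to zero.
dpgmm = BayesianGaussianMixture(
    n_components=10, covariance_type="full",
    weight_concentration_prior_type="dirichlet_process",
    random_state=0).fit(X)

print("EM weights:       ", np.round(gmm.weights_, 2))
print("Bayesian weights: ", np.round(dpgmm.weights_, 2))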

neighbors.KNeighborsClassifier()

class sklearn.neighbors.KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=1, **kwargs)

Classifier implementing the k-nearest neighbors vote. Read more in the User Guide.

Parameters:
    n_neighbors : int, optional (default = 5)
        Number of neighbors to use by default for kneighbors queries.
    weights : str or callable, optional (default = 'uniform')
        Weight function used in prediction. Possible values: 'uniform' (all points in each neighborhood are weighted equally), 'distance' (points weighted by the inverse of their distance), or a user-defined callable.
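A quick sketch of typical usage on the iris dataset (the train/test split and the weights='distance' choice are illustrative):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# With weights='distance', closer neighbors get a larger vote than distant ones.
knn = KNeighborsClassifier(n_neighbors=5, weights="distance")
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))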

linear_model.SGDRegressor()

class sklearn.linear_model.SGDRegressor(loss='squared_loss', penalty='l2', alpha=0.0001, l1_ratio=0.15, fit_intercept=True, n_iter=5, shuffle=True, verbose=0, epsilon=0.1, random_state=None, learning_rate='invscaling', eta0=0.01, power_t=0.25, warm_start=False, average=False)

Linear model fitted by minimizing a regularized empirical loss with SGD. SGD stands for Stochastic Gradient Descent: the gradient of the loss is estimated one sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate).
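A minimal sketch, assuming synthetic data; the features are standardized first because SGD is sensitive to feature scaling (the data and parameter values are illustrative):

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = rng.randn(500, 5)
y = X.dot([1.0, 2.0, 0.0, -1.0, 0.5]) + 0.1 * rng.randn(500)

X_scaled = StandardScaler().fit_transform(X)    # SGD works best with roughly unit-scale features

sgd = SGDRegressor(penalty="l2", alpha=1e-4, random_state=0)
sgd.fit(X_scaled, y)
print(np.round(sgd.coef_, 2))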

calibration.CalibratedClassifierCV()

class sklearn.calibration.CalibratedClassifierCV(base_estimator=None, method='sigmoid', cv=3)

Probability calibration with isotonic regression or sigmoid. With this class, the base_estimator is fit on the train set of the cross-validation generator and the test set is used for calibration. The probabilities for each of the folds are then averaged for prediction. In case that cv='prefit' is passed to __init__, it is assumed that base_estimator has been fitted already and all data is used for calibration.
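A minimal sketch, assuming a GaussianNB base estimator and synthetic data (both are illustrative choices, not part of the reference):

from sklearn.calibration import CalibratedClassifierCV
from sklearn.datasets import make_classification
from sklearn.naive_bayes import GaussianNB

X, y = make_classification(n_samples=1000, random_state=0)

# Fit the base estimator on each training fold; use the held-out fold
# to fit the sigmoid calibration map, then average across folds.
calibrated = CalibratedClassifierCV(GaussianNB(), method="sigmoid", cv=3)
calibrated.fit(X, y)
print(calibrated.predict_proba(X[:5]))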

4.6. Kernel Approximation

This submodule contains functions that approximate the feature mappings that correspond to certain kernels, as they are used for example in support vector machines (see Support Vector Machines). The following feature functions perform non-linear transformations of the input, which can serve as a basis for linear classification or other algorithms. The advantage of using approximate explicit feature maps compared to the kernel trick, which makes use of feature maps implicitly, is that explicit mappings can be better suited for online learning and can significantly reduce the cost of learning with very large datasets.
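A hedged sketch of the pattern described above, assuming RBFSampler as the approximate feature map and a linear SGDClassifier on top (the dataset and gamma value are illustrative):

from sklearn.datasets import make_classification
from sklearn.kernel_approximation import RBFSampler
from sklearn.linear_model import SGDClassifier

X, y = make_classification(n_samples=500, random_state=0)

# Explicitly map the inputs into an approximate RBF kernel feature space,
# then train an ordinary linear classifier on the transformed features.
rbf_feature = RBFSampler(gamma=1.0, random_state=1)
X_features = rbf_feature.fit_transform(X)

clf = SGDClassifier(random_state=0)
clf.fit(X_features, y)
print("training accuracy:", clf.score(X_features, y))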