decomposition.SparseCoder()

class sklearn.decomposition.SparseCoder(dictionary, transform_algorithm='omp', transform_n_nonzero_coefs=None, transform_alpha=None, split_sign=False, n_jobs=1) [source]

Sparse coding. Finds a sparse representation of data against a fixed, precomputed dictionary. Each row of the result is the solution to a sparse coding problem. The goal is to find a sparse array code such that:

    X ~= code * dictionary

Read more in the User Guide.

Parameters:
    dictionary : array, [n_components, n_features]
        The dictionary atoms used for sparse coding. Lines are assumed to be normalized to unit norm.
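A minimal usage sketch, with a random synthetic dictionary and data chosen purely for illustration, showing how samples are encoded against a fixed dictionary:

import numpy as np
from sklearn.decomposition import SparseCoder

rng = np.random.RandomState(0)
dictionary = rng.randn(15, 64)                                    # 15 atoms, 64 features (illustrative shapes)
dictionary /= np.linalg.norm(dictionary, axis=1, keepdims=True)   # atoms normalized to unit norm
X = rng.randn(10, 64)                                             # 10 samples to encode

coder = SparseCoder(dictionary=dictionary,
                    transform_algorithm='omp',
                    transform_n_nonzero_coefs=3)
code = coder.transform(X)                                         # shape (10, 15); X ~= code @ dictionary
print(code.shape, np.count_nonzero(code, axis=1))                 # at most 3 nonzeros per row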

Comparison of kernel ridge regression and SVR

Both kernel ridge regression (KRR) and SVR learn a non-linear function by employing the kernel trick, i.e., they learn a linear function in the space induced by the respective kernel, which corresponds to a non-linear function in the original space. They differ in the loss function (ridge versus epsilon-insensitive loss). In contrast to SVR, fitting a KRR can be done in closed form and is typically faster for medium-sized datasets. On the other hand, the learned model is non-sparse and thus slower than SVR, which learns a sparse model for epsilon > 0, at prediction time.
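A hedged sketch of the comparison on a small synthetic 1-D regression task (the kernel parameters below are illustrative, not tuned):

import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.svm import SVR

rng = np.random.RandomState(0)
X = 5 * rng.rand(200, 1)
y = np.sin(X).ravel() + 0.1 * rng.randn(200)

krr = KernelRidge(kernel='rbf', alpha=1.0, gamma=0.5).fit(X, y)    # closed-form fit, squared loss
svr = SVR(kernel='rbf', C=1.0, epsilon=0.1, gamma=0.5).fit(X, y)   # epsilon-insensitive loss, sparse in support vectors

print(krr.score(X, y), svr.score(X, y))
print("support vectors used by SVR:", len(svr.support_))           # KRR implicitly uses all training points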

sklearn.model_selection.train_test_split()

sklearn.model_selection.train_test_split(*arrays, **options) [source]

Split arrays or matrices into random train and test subsets. Quick utility that wraps input validation, next(ShuffleSplit().split(X, y)), and application to the input data into a single call for splitting (and optionally subsampling) data in a one-liner. Read more in the User Guide.

Parameters:
    *arrays : sequence of indexables with same length / shape[0]
        Allowed inputs are lists, numpy arrays, scipy-sparse matrices or pandas dataframes.
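A minimal usage sketch with made-up arrays (shapes and test_size are illustrative):

import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)
y = np.arange(10)

# Splits X and y consistently; random_state makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42)
print(X_train.shape, X_test.shape)   # (7, 2) (3, 2)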

calibration.CalibratedClassifierCV()

class sklearn.calibration.CalibratedClassifierCV(base_estimator=None, method='sigmoid', cv=3) [source]

Probability calibration with isotonic regression or sigmoid. With this class, the base_estimator is fit on the train set of the cross-validation generator and the test set is used for calibration. The probabilities for each of the folds are then averaged for prediction. In case cv='prefit' is passed to __init__, it is assumed that base_estimator has been fitted already and all data is used for calibration.
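A hedged sketch on synthetic data: calibrating the decision scores of a LinearSVC (which has no predict_proba of its own) into probabilities; the dataset and fold count are illustrative:

from sklearn.datasets import make_classification
from sklearn.calibration import CalibratedClassifierCV
from sklearn.svm import LinearSVC

X, y = make_classification(n_samples=500, random_state=0)
base = LinearSVC(random_state=0)

# Sigmoid (Platt) calibration with 3-fold cross-validation.
calibrated = CalibratedClassifierCV(base, method='sigmoid', cv=3)
calibrated.fit(X, y)
print(calibrated.predict_proba(X[:3]))   # probabilities averaged over the folds' calibrated classifiers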

Gaussian Mixture Model Sine Curve

This example demonstrates the behavior of Gaussian mixture models fit on data that was not sampled from a mixture of Gaussian random variables. The dataset consists of 100 points loosely spaced along a noisy sine curve, so there is no ground-truth number of Gaussian components. The first model is a classical Gaussian Mixture Model with 10 components fit with the Expectation-Maximization algorithm. The second model is a Bayesian Gaussian Mixture Model with a Dirichlet process prior fit with variational inference.
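A rough sketch of the two models on such data (the noise level and concentration prior below are assumptions, and the plotting from the original example is omitted):

import numpy as np
from sklearn.mixture import GaussianMixture, BayesianGaussianMixture

rng = np.random.RandomState(0)
t = 4 * np.pi * rng.rand(100)
X = np.column_stack([t, np.sin(t) + 0.2 * rng.randn(100)])   # 100 noisy sine-curve points

gmm = GaussianMixture(n_components=10, random_state=0).fit(X)            # EM: tends to use all 10 components
bgmm = BayesianGaussianMixture(n_components=10,
                               weight_concentration_prior=0.01,
                               random_state=0).fit(X)                    # Dirichlet prior can prune components
print(gmm.weights_.round(2))
print(bgmm.weights_.round(2))   # several weights driven near zero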

neighbors.KNeighborsClassifier()

class sklearn.neighbors.KNeighborsClassifier(n_neighbors=5, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, n_jobs=1, **kwargs) [source]

Classifier implementing the k-nearest neighbors vote. Read more in the User Guide.

Parameters:
    n_neighbors : int, optional (default = 5)
        Number of neighbors to use by default for kneighbors queries.
    weights : str or callable, optional (default = 'uniform')
        Weight function used in prediction. Possible values: 'uniform' (all neighbors weighted equally), 'distance' (weight points by the inverse of their distance), or a user-defined callable.
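A minimal usage sketch on the iris dataset (dataset choice and weights='distance' are illustrative):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5, weights='distance')   # closer neighbors get a larger vote
knn.fit(X_train, y_train)
print(knn.score(X_test, y_test))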

linear_model.SGDRegressor()

class sklearn.linear_model.SGDRegressor(loss='squared_loss', penalty='l2', alpha=0.0001, l1_ratio=0.15, fit_intercept=True, n_iter=5, shuffle=True, verbose=0, epsilon=0.1, random_state=None, learning_rate='invscaling', eta0=0.01, power_t=0.25, warm_start=False, average=False) [source]

Linear model fitted by minimizing a regularized empirical loss with SGD. SGD stands for Stochastic Gradient Descent: the gradient of the loss is estimated one sample at a time and the model is updated along the way with a decreasing strength schedule (aka learning rate).
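A hedged sketch on synthetic data; since SGD is sensitive to feature scales, the estimator is wrapped in a pipeline with StandardScaler (the data and hyperparameters are illustrative):

import numpy as np
from sklearn.linear_model import SGDRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.RandomState(0)
X = rng.randn(200, 3)
y = X @ np.array([1.5, -2.0, 0.5]) + 0.1 * rng.randn(200)   # linear target plus noise

model = make_pipeline(StandardScaler(),
                      SGDRegressor(penalty='l2', alpha=1e-4,
                                   learning_rate='invscaling', eta0=0.01,
                                   random_state=0))
model.fit(X, y)
print(model.score(X, y))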

linear_model.ARDRegression()

class sklearn.linear_model.ARDRegression(n_iter=300, tol=0.001, alpha_1=1e-06, alpha_2=1e-06, lambda_1=1e-06, lambda_2=1e-06, compute_score=False, threshold_lambda=10000.0, fit_intercept=True, normalize=False, copy_X=True, verbose=False) [source]

Bayesian ARD regression. Fit the weights of a regression model, using an ARD prior. The weights of the regression model are assumed to follow Gaussian distributions. Also estimated are the parameters lambda (the precisions of the weights' distributions) and alpha (the precision of the noise distribution).
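A hedged sketch on a synthetic sparse linear problem where only a few features are informative, to show the pruning effect of the ARD prior (data and coefficients are made up for illustration):

import numpy as np
from sklearn.linear_model import ARDRegression

rng = np.random.RandomState(0)
X = rng.randn(100, 10)
w = np.zeros(10)
w[:3] = [2.0, -1.0, 0.5]                 # only 3 informative features
y = X @ w + 0.05 * rng.randn(100)

ard = ARDRegression(compute_score=True).fit(X, y)
print(ard.coef_.round(2))                # irrelevant coefficients are driven toward zero
print(ard.lambda_.round(1))              # per-weight precisions; large values indicate pruned weights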

Visualization of MLP weights on MNIST

Sometimes looking at the learned coefficients of a neural network can provide insight into its learning behavior. For example, if the weights look unstructured, maybe some were not used at all; if very large coefficients exist, maybe regularization was too low or the learning rate too high. This example shows how to plot some of the first-layer weights of an MLPClassifier trained on the MNIST dataset. The input data consists of 28x28-pixel handwritten digits, leading to 784 features in the dataset.
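A rough sketch of the same plotting idea on scikit-learn's built-in 8x8 digits dataset instead of the full 28x28 MNIST (the hidden layer size and iteration count are assumptions kept small for speed):

import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.neural_network import MLPClassifier

X, y = load_digits(return_X_y=True)
X = X / 16.0                                            # scale pixel values to [0, 1]

mlp = MLPClassifier(hidden_layer_sizes=(16,), max_iter=200, random_state=0)
mlp.fit(X, y)

fig, axes = plt.subplots(4, 4, figsize=(6, 6))
for coef, ax in zip(mlp.coefs_[0].T, axes.ravel()):     # one column of coefs_[0] per hidden unit
    ax.imshow(coef.reshape(8, 8), cmap=plt.cm.gray)     # reshape the 64 input weights back to 8x8
    ax.set_xticks(())
    ax.set_yticks(())
plt.show()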

Model Complexity Influence

Demonstrate how model complexity influences both prediction accuracy and computational performance. The dataset is the Boston Housing dataset for regression and the 20 Newsgroups dataset for classification. For each class of models we vary the model complexity through the choice of relevant model parameters and measure the influence on both computational performance (latency) and predictive power (MSE or Hamming loss). Author: Eustache Diemert <eustache@diemert.fr>
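A simplified sketch of the idea, not the original benchmark: one complexity parameter of one model family is varied on synthetic regression data, recording test error and prediction latency (the model, dataset, and parameter grid are assumptions):

import time
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=2000, n_features=20, noise=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

for n_estimators in [10, 50, 100, 200]:                 # complexity knob for this model family
    model = GradientBoostingRegressor(n_estimators=n_estimators, random_state=0)
    model.fit(X_train, y_train)
    start = time.time()
    pred = model.predict(X_test)
    latency = time.time() - start
    print(n_estimators, round(mean_squared_error(y_test, pred), 1), f"{latency:.4f}s")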