sklearn.metrics.make_scorer()

sklearn.metrics.make_scorer(score_func, greater_is_better=True, needs_proba=False, needs_threshold=False, **kwargs) Make a scorer from a performance metric or loss function. This factory function wraps scoring functions for use in GridSearchCV and cross_val_score. It takes a score function, such as accuracy_score, mean_squared_error, adjusted_rand_index or average_precision, and returns a callable that scores an estimator's output. Read more in the User Guide. Parameters: score_fun
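As a rough illustration of the wrapping described above, the sketch below turns a loss (mean_squared_error) into a scorer by passing greater_is_better=False, which makes the returned callable report the negated loss so that larger values are still better. The toy data and the choice of DummyRegressor are assumptions made for this example only; they are not part of the original entry.

```python
from sklearn.dummy import DummyRegressor
from sklearn.metrics import make_scorer, mean_squared_error

# Toy data, invented for this sketch.
X = [[0.0], [1.0], [2.0], [3.0]]
y = [0.0, 1.0, 2.0, 3.0]

# mean_squared_error is a loss (lower is better), so ask make_scorer
# to flip its sign; the resulting scorer follows "greater is better".
neg_mse_scorer = make_scorer(mean_squared_error, greater_is_better=False)

# A baseline that always predicts the training mean (1.5 here).
model = DummyRegressor(strategy="mean").fit(X, y)

# A scorer is called as scorer(estimator, X, y), not as metric(y_true, y_pred).
raw_loss = mean_squared_error(y, model.predict(X))  # 1.25 for this toy data
score = neg_mse_scorer(model, X, y)                 # -1.25, the negated loss

print(raw_loss, score)
```

The same scorer object can be passed as the scoring argument of GridSearchCV or cross_val_score, which is the use case the entry mentions.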

linear_model.LogisticRegression()

class sklearn.linear_model.LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='liblinear', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1) Logistic Regression (aka logit, MaxEnt) classifier. In the multiclass case, the training algorithm uses the one-vs-rest (OvR) scheme if the 'multi_class' option is set to 'ovr', and uses the cross-entropy loss if the 'multi

sklearn.metrics.pairwise.chi2_kernel()

sklearn.metrics.pairwise.chi2_kernel(X, Y=None, gamma=1.0) Computes the exponential chi-squared kernel between X and Y. The chi-squared kernel is computed between each pair of rows in X and Y. X and Y have to be non-negative. This kernel is most commonly applied to histograms. The chi-squared kernel is given by: k(x, y) = exp(-gamma * Sum_i [(x_i - y_i)^2 / (x_i + y_i)]). It can be interpreted as a weighted difference per entry. Read more in the User Guide. Parameters: X : array-like of shape (n_samp
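The kernel formula above can be sketched in plain Python to make the row-by-row computation concrete. This is an illustrative re-implementation, not the library's code: the function names are hypothetical, and skipping entries where both values are zero (to avoid 0/0) is an assumption of this sketch.

```python
import math


def chi2_kernel_pair(x, y, gamma=1.0):
    """exp(-gamma * sum_i (x_i - y_i)^2 / (x_i + y_i)) for two non-negative rows."""
    if len(x) != len(y):
        raise ValueError("rows must have the same length")
    total = 0.0
    for xi, yi in zip(x, y):
        if xi < 0 or yi < 0:
            raise ValueError("inputs must be non-negative")
        denom = xi + yi
        if denom > 0:  # assumption: an entry that is zero in both rows adds nothing
            total += (xi - yi) ** 2 / denom
    return math.exp(-gamma * total)


def chi2_kernel_matrix(X, Y=None, gamma=1.0):
    """Kernel value for every pair of rows in X and Y (Y defaults to X)."""
    if Y is None:
        Y = X
    return [[chi2_kernel_pair(x, y, gamma) for y in Y] for x in X]


# Two small histograms, invented for this sketch.
hists = [[1.0, 0.0, 3.0], [2.0, 2.0, 0.0]]
K = chi2_kernel_matrix(hists)
print(K)
```

Identical rows give a kernel value of 1, and the value shrinks toward 0 as the histograms diverge; the matrix is symmetric when Y is omitted.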

base.ClassifierMixin

class sklearn.base.ClassifierMixin Mixin class for all classifiers in scikit-learn. Methods: score(X, y[, sample_weight]) Returns the mean accuracy on the given test data and labels. score(X, y, sample_weight=None) Returns the mean accuracy on the given test data and labels. In multi-label classification, this is the subset accuracy, which is a harsh metric since you require for each sample tha
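To illustrate why subset accuracy is called harsh, the following sketch compares it with a per-label accuracy on a tiny multi-label example. Both helper functions and the data are hypothetical and written for this illustration; they are not the mixin's implementation, and they rely on the usual definition of subset accuracy (a sample counts only if its whole label vector matches).

```python
def subset_accuracy(y_true, y_pred):
    """Fraction of samples whose entire label vector is predicted exactly."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same number of samples")
    exact = sum(1 for t, p in zip(y_true, y_pred) if list(t) == list(p))
    return exact / len(y_true)


def per_label_accuracy(y_true, y_pred):
    """Fraction of individual label entries that are predicted correctly."""
    pairs = [(a, b) for t, p in zip(y_true, y_pred) for a, b in zip(t, p)]
    return sum(1 for a, b in pairs if a == b) / len(pairs)


# Three samples with three binary labels each, invented for this sketch.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 1], [0, 1, 1], [1, 0, 0]]

strict = subset_accuracy(y_true, y_pred)     # only the first sample matches exactly
lenient = per_label_accuracy(y_true, y_pred)  # 7 of the 9 label entries are correct

print(strict, lenient)
```

A single wrong label in a sample costs that sample its full credit under subset accuracy, which is why the two numbers can differ so much.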

Blind source separation using FastICA

An example of estimating sources from noisy data. Independent component analysis (ICA) is used to estimate sources given noisy measurements. Imagine 3 instruments playing simultaneously and 3 microphones recording the mixed signals. ICA is used to recover the sources, i.e., what is played by each instrument. Importantly, PCA fails at recovering our instruments since the related signals reflect non-Gaussian processes. print(__doc__) import numpy as np import matplotlib.pyplot as plt from scipy im

mixture.VBGMM()

Warning: DEPRECATED. class sklearn.mixture.VBGMM(*args, **kwargs) Variational Inference for the Gaussian Mixture Model. Deprecated since version 0.18: This class will be removed in 0.20. Use sklearn.mixture.BayesianGaussianMixture with parameter weight_concentration_prior_type='dirichlet_distribution' instead. Variational inference for a Gaussian mixture model probability distribution. This class allows for easy and efficient inference of an approximate posterior distribution over

sklearn.metrics.label_ranking_average_precision_score()

sklearn.metrics.label_ranking_average_precision_score(y_true, y_score) Compute ranking-based average precision. Label ranking average precision (LRAP) is the average over each ground truth label assigned to each sample, of the ratio of true vs. total labels with lower score. This metric is used in multilabel ranking problems, where the goal is to give better rank to the labels associated to each sample. The obtained score is always strictly greater than 0 and the best value is 1. Rea
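A plain-Python sketch of the LRAP definition may help make the ratio concrete. It assumes the common reading in which, for each true label, the rank counts every label scored at least as high, and the numerator counts the true labels among them. The function name is hypothetical, and scoring samples with no true labels (or only true labels) as 1 is an assumption of this sketch rather than something stated in the entry.

```python
def lrap(y_true, y_score):
    """Label ranking average precision over a list of samples."""
    total = 0.0
    for truth, scores in zip(y_true, y_score):
        relevant = [j for j, t in enumerate(truth) if t]
        # Assumption: degenerate samples contribute a perfect score.
        if not relevant or len(relevant) == len(truth):
            total += 1.0
            continue
        sample_sum = 0.0
        for j in relevant:
            # Rank of label j: how many labels score at least as high.
            rank = sum(1 for s in scores if s >= scores[j])
            # How many of those labels are true labels.
            hits = sum(1 for k in relevant if scores[k] >= scores[j])
            sample_sum += hits / rank
        total += sample_sum / len(relevant)
    return total / len(y_true)


# Two samples, three labels each, invented for this sketch.
y_true = [[1, 0, 0], [0, 0, 1]]
y_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]

# Sample 1: the true label is ranked 2nd (ratio 1/2).
# Sample 2: the true label is ranked 3rd (ratio 1/3).
result = lrap(y_true, y_score)
print(result)
```

Ranking every true label above every false one yields the best value of 1, and since each ratio is positive the result stays strictly above 0.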

linear_model.LassoLarsIC()

class sklearn.linear_model.LassoLarsIC(criterion='aic', fit_intercept=True, verbose=False, normalize=True, precompute='auto', max_iter=500, eps=2.2204460492503131e-16, copy_X=True, positive=False) Lasso model fit with Lars using BIC or AIC for model selection. The optimization objective for Lasso is: (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1. AIC is the Akaike information criterion and BIC is the Bayesian information criterion. Such criteria are useful to select the valu

sklearn.metrics.explained_variance_score()

sklearn.metrics.explained_variance_score(y_true, y_pred, sample_weight=None, multioutput='uniform_average') Explained variance regression score function. Best possible score is 1.0, lower values are worse. Read more in the User Guide. Parameters: y_true : array-like of shape = (n_samples) or (n_samples, n_outputs) Ground truth (correct) target values. y_pred : array-like of shape = (n_samples) or (n_samples, n_outputs) Estimated target values. sample_weight : array-like of shap
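For a single output without sample weights, the explained variance score is commonly written as 1 - Var(y_true - y_pred) / Var(y_true). The sketch below implements that formula in plain Python; the function name is hypothetical, and multi-output targets, sample weights, and constant targets are deliberately left out as simplifying assumptions.

```python
def explained_variance(y_true, y_pred):
    """1 - Var(y_true - y_pred) / Var(y_true) for one unweighted output."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")

    def variance(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)

    target_var = variance(y_true)
    if target_var == 0:
        # Assumption: this sketch does not define the score for constant targets.
        raise ValueError("y_true has zero variance")
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / target_var


# Toy values, invented for this sketch.
y_true = [3.0, -0.5, 2.0, 7.0]
y_pred = [2.5, 0.0, 2.0, 8.0]
score = explained_variance(y_true, y_pred)

# A constant offset leaves the residual variance at zero, so the score stays 1.0.
shifted = explained_variance(y_true, [t + 1.0 for t in y_true])

print(score, shifted)
```

The second call shows a property worth keeping in mind: because only the variance of the residuals enters the formula, a systematic bias in the predictions is not penalized.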

SVM: Weighted samples

Plot decision function of a weighted dataset, where the size of each point is proportional to its weight. The sample weighting rescales the C parameter, which means that the classifier puts more emphasis on getting these points right. The effect might often be subtle. To emphasize the effect here, we particularly weight outliers, making the deformation of the decision boundary very visible. print(__doc__) import numpy as np import matplotlib.pyplot as plt from sklearn import svm def plot_de