linear_model.LassoLarsCV()

class sklearn.linear_model.LassoLarsCV(fit_intercept=True, verbose=False, max_iter=500, normalize=True, precompute='auto', cv=None, max_n_alphas=1000, n_jobs=1, eps=2.2204460492503131e-16, copy_X=True, positive=False)

Cross-validated Lasso, using the LARS algorithm. The optimization objective for Lasso is:

(1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1

Read more in the User Guide.

Parameters: fit_intercept : boolean Whether to calculate the intercept for this model. If set to False, no intercept will be used in calculations (e.g. data is expected to be already centered).
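A minimal usage sketch on synthetic data (the dataset shape, noise level, and cv=5 are arbitrary illustrative choices, not from the documentation above):

import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LassoLarsCV

# Synthetic regression problem with a few informative features
X, y = make_regression(n_samples=200, n_features=30, n_informative=5,
                       noise=4.0, random_state=0)

# Cross-validation over the LARS path selects the regularization strength
model = LassoLarsCV(cv=5).fit(X, y)
print("chosen alpha:", model.alpha_)
print("nonzero coefficients:", np.sum(model.coef_ != 0))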

sklearn.metrics.accuracy_score()

sklearn.metrics.accuracy_score(y_true, y_pred, normalize=True, sample_weight=None)

Accuracy classification score. In multilabel classification, this function computes subset accuracy: the set of labels predicted for a sample must exactly match the corresponding set of labels in y_true. Read more in the User Guide.

Parameters: y_true : 1d array-like, or label indicator array / sparse matrix Ground truth (correct) labels. y_pred : 1d array-like, or label indicator array / sparse matrix Predicted labels, as returned by a classifier.
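A minimal usage sketch (the label vectors are made up for illustration):

from sklearn.metrics import accuracy_score

y_true = [0, 1, 2, 2, 1]
y_pred = [0, 1, 1, 2, 1]

# Fraction of exactly matching labels: 4 of 5 correct -> 0.8
print(accuracy_score(y_true, y_pred))

# normalize=False returns the raw count of correct predictions instead
print(accuracy_score(y_true, y_pred, normalize=False))  # 4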

Explicit feature map approximation for RBF kernels

An example illustrating the approximation of the feature map of an RBF kernel. It shows how to use RBFSampler and Nystroem to approximate the feature map of an RBF kernel for classification with an SVM on the digits dataset. Results using a linear SVM in the original space, a linear SVM using the approximate mappings, and a kernelized SVM are compared. Timings and accuracy are shown for varying amounts of Monte Carlo samplings (in the case of RBFSampler, which uses random Fourier features) and different sized subsets of the training set (in the case of Nystroem).
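The core idea, as a condensed sketch (the gamma value, number of components, and train/test split are arbitrary choices for illustration):

from sklearn.datasets import load_digits
from sklearn.kernel_approximation import RBFSampler, Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

digits = load_digits()
X, y = digits.data / 16., digits.target  # scale pixel values to [0, 1]
X_train, y_train = X[:1000], y[:1000]
X_test, y_test = X[1000:], y[1000:]

for sampler in (RBFSampler(gamma=0.2, n_components=300, random_state=0),
                Nystroem(gamma=0.2, n_components=300, random_state=0)):
    # Map into the approximate feature space, then fit a linear SVM there
    clf = make_pipeline(sampler, LinearSVC())
    clf.fit(X_train, y_train)
    print(type(sampler).__name__, clf.score(X_test, y_test))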

Receiver Operating Characteristic with cross validation

Example of Receiver Operating Characteristic (ROC) metric to evaluate classifier output quality using cross-validation. ROC curves typically feature true positive rate on the Y axis, and false positive rate on the X axis. This means that the top left corner of the plot is the "ideal" point - a false positive rate of zero, and a true positive rate of one. This is not very realistic, but it does mean that a larger area under the curve (AUC) is usually better. The "steepness" of ROC curves is also important, since it is ideal to maximize the true positive rate while minimizing the false positive rate.
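A stripped-down sketch of the same idea, without the plotting (per-fold ROC AUC on a binary problem; the dataset, classifier, and fold count are illustrative choices):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, auc
from sklearn.model_selection import StratifiedKFold

X, y = make_classification(n_samples=500, random_state=0)
cv = StratifiedKFold(n_splits=5)
clf = LogisticRegression()

for i, (train, test) in enumerate(cv.split(X, y)):
    # Fit on the training fold, compute the ROC curve on the held-out fold
    probas = clf.fit(X[train], y[train]).predict_proba(X[test])
    fpr, tpr, _ = roc_curve(y[test], probas[:, 1])
    print("fold %d: AUC = %.3f" % (i, auc(fpr, tpr)))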

Digits Classification Exercise

A tutorial exercise regarding the use of classification techniques on the Digits dataset. This exercise is used in the Classification part of the Supervised learning: predicting an output variable from high-dimensional observations section of A tutorial on statistical-learning for scientific data processing.

print(__doc__)

from sklearn import datasets, neighbors, linear_model

digits = datasets.load_digits()
X_digits = digits.data
y_digits = digits.target

n_samples = len(X_digits)

X_train = X_digits[:int(.9 * n_samples)]
y_train = y_digits[:int(.9 * n_samples)]
X_test = X_digits[int(.9 * n_samples):]
y_test = y_digits[int(.9 * n_samples):]
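The exercise then fits the two classifiers and compares their held-out scores; a sketch of that remaining step (estimator settings are left at their defaults here):

knn = neighbors.KNeighborsClassifier()
logistic = linear_model.LogisticRegression()

print('KNN score: %f'
      % knn.fit(X_train, y_train).score(X_test, y_test))
print('LogisticRegression score: %f'
      % logistic.fit(X_train, y_train).score(X_test, y_test))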

exceptions.UndefinedMetricWarning

class sklearn.exceptions.UndefinedMetricWarning

Warning used when the metric is invalid.

Changed in version 0.18: Moved from sklearn.base.
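One common way to encounter this warning, as a small sketch (the labels are made up; precision is undefined here because class 1 is never predicted, so the score is 0/0):

import warnings
from sklearn.exceptions import UndefinedMetricWarning
from sklearn.metrics import precision_score

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # No sample is predicted as class 1, so its precision is undefined
    precision_score([0, 1, 1], [0, 0, 0])

print(any(issubclass(w.category, UndefinedMetricWarning) for w in caught))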

cluster.bicluster.SpectralCoclustering()

class sklearn.cluster.bicluster.SpectralCoclustering(n_clusters=3, svd_method='randomized', n_svd_vecs=None, mini_batch=False, init='k-means++', n_init=10, n_jobs=1, random_state=None)

Spectral Co-Clustering algorithm (Dhillon, 2001). Clusters rows and columns of an array X to solve the relaxed normalized cut of the bipartite graph created from X as follows: the edge between row vertex i and column vertex j has weight X[i, j]. The resulting bicluster structure is block-diagonal, since each row and each column belongs to exactly one bicluster.
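A small sketch on synthetic data with a planted block structure (the matrix shape, noise level, and cluster count are arbitrary):

from sklearn.cluster.bicluster import SpectralCoclustering
from sklearn.datasets import make_biclusters
from sklearn.metrics import consensus_score

# Generate a matrix with 4 planted biclusters; rows/cols give the ground truth
data, rows, cols = make_biclusters(shape=(100, 80), n_clusters=4,
                                   noise=5, random_state=0)

model = SpectralCoclustering(n_clusters=4, random_state=0)
model.fit(data)

# 1.0 means the recovered biclusters match the planted ones exactly
print(consensus_score(model.biclusters_, (rows, cols)))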

preprocessing.MultiLabelBinarizer()

class sklearn.preprocessing.MultiLabelBinarizer(classes=None, sparse_output=False)

Transform between iterable of iterables and a multilabel format. Although a list of sets or tuples is a very intuitive format for multilabel data, it is unwieldy to process. This transformer converts between this intuitive format and the supported multilabel format: a (samples x classes) binary matrix indicating the presence of a class label.

Parameters: classes : array-like of shape [n_classes] (optional) Indicates an ordering for the class labels.
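A minimal round-trip sketch (the label sets are made up):

from sklearn.preprocessing import MultiLabelBinarizer

mlb = MultiLabelBinarizer()
# Each sample is an iterable of labels; the result is a binary matrix
binary = mlb.fit_transform([{'sci-fi', 'thriller'}, {'comedy'}])
print(mlb.classes_)  # ['comedy' 'sci-fi' 'thriller']
print(binary)        # [[0 1 1], [1 0 0]]

# inverse_transform recovers the label sets from the binary matrix
print(mlb.inverse_transform(binary))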

cluster.bicluster.SpectralBiclustering()

class sklearn.cluster.bicluster.SpectralBiclustering(n_clusters=3, method='bistochastic', n_components=6, n_best=3, svd_method='randomized', n_svd_vecs=None, mini_batch=False, init='k-means++', n_init=10, n_jobs=1, random_state=None)

Spectral biclustering (Kluger, 2003). Partitions rows and columns under the assumption that the data has an underlying checkerboard structure. For instance, if there are two row partitions and three column partitions, each row will belong to three biclusters, and each column will belong to two biclusters.
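A companion sketch using a planted checkerboard (the shape, noise level, and 3 x 4 cluster grid are illustrative choices):

from sklearn.cluster.bicluster import SpectralBiclustering
from sklearn.datasets import make_checkerboard
from sklearn.metrics import consensus_score

# Matrix with a hidden checkerboard of 3 row clusters x 4 column clusters
data, rows, cols = make_checkerboard(shape=(120, 120), n_clusters=(3, 4),
                                     noise=10, random_state=0)

model = SpectralBiclustering(n_clusters=(3, 4), method='log', random_state=0)
model.fit(data)

# Compare the recovered checkerboard against the planted one
print(consensus_score(model.biclusters_, (rows, cols)))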

A demo of structured Ward hierarchical clustering on a raccoon face image

Compute the segmentation of a 2D image with Ward hierarchical clustering. The clustering is spatially constrained so that each segmented region comes out in one piece.

# Author: Vincent Michel, 2010
#         Alexandre Gramfort, 2011
# License: BSD 3 clause

print(__doc__)

import time as time
import numpy as np
import scipy as sp
import matplotlib.pyplot as plt

from sklearn.feature_extraction.image import grid_to_graph
from sklearn.cluster import AgglomerativeClustering
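A sketch of the example's core step, built from the imports above (the number of regions, downsampling factor, and the sp.misc.face loader are assumptions; in newer SciPy versions the face image moved to scipy.datasets.face):

# Load the raccoon face image in grayscale and downsample it for speed
face = sp.misc.face(gray=True)
face = face[::4, ::4]
X = np.reshape(face, (-1, 1))

# Connectivity graph: each pixel connects only to its grid neighbours,
# which is what spatially constrains the Ward merges
connectivity = grid_to_graph(*face.shape)

st = time.time()
ward = AgglomerativeClustering(n_clusters=15, linkage='ward',
                               connectivity=connectivity)
ward.fit(X)
label = np.reshape(ward.labels_, face.shape)
print("Elapsed time:", time.time() - st)
print("Number of pixels:", label.size)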