sklearn.calibration.calibration_curve()

sklearn.calibration.calibration_curve(y_true, y_prob, normalize=False, n_bins=5) [source]

Compute true and predicted probabilities for a calibration curve. Read more in the User Guide.

Parameters:

    y_true : array, shape (n_samples,)
        True targets.

    y_prob : array, shape (n_samples,)
        Probabilities of the positive class.

    normalize : bool, optional, default=False
        Whether y_prob needs to be normalized into the interval [0, 1], i.e. is not a proper probability. If True, the smallest value in y_prob is mapped onto 0 and the largest one onto 1.
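A minimal usage sketch (the toy labels, probabilities, and n_bins value below are illustrative assumptions, not part of the reference text):

import numpy as np
from sklearn.calibration import calibration_curve

y_true = np.array([0, 0, 0, 1, 1, 1])
y_prob = np.array([0.1, 0.4, 0.35, 0.8, 0.65, 0.9])

# fraction of positives and mean predicted probability in each bin
prob_true, prob_pred = calibration_curve(y_true, y_prob, n_bins=3)
print(prob_true)
print(prob_pred)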

Lasso and Elastic Net

Lasso and elastic net (L1 and L2 penalisation) implemented using a coordinate descent. The coefficients can be forced to be positive.

Out:

Computing regularization path using the lasso...
Computing regularization path using the positive lasso...
Computing regularization path using the elastic net...
Computing regularization path using the positive elastic net...

print(__doc__)

# Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause

from itertools import cycle
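A minimal sketch of computing the paths this example describes, via lasso_path and enet_path; the eps and l1_ratio values here are illustrative assumptions, and plotting is omitted:

import numpy as np
from sklearn import datasets
from sklearn.linear_model import lasso_path, enet_path

diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target
X /= X.std(axis=0)  # standardize so the penalty treats features evenly

# Lasso path by coordinate descent; positive=True forces coefficients >= 0
alphas_lasso, coefs_lasso, _ = lasso_path(X, y, eps=5e-3)
alphas_pos, coefs_pos, _ = lasso_path(X, y, eps=5e-3, positive=True)

# Elastic-net path mixing L1 and L2 penalties via l1_ratio
alphas_enet, coefs_enet, _ = enet_path(X, y, eps=5e-3, l1_ratio=0.8)
print(coefs_lasso.shape)  # (n_features, n_alphas)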

Cross-validation on Digits Dataset Exercise

A tutorial exercise using cross-validation with an SVM on the Digits dataset. This exercise is used in the Cross-validation generators part of the Model selection: choosing estimators and their parameters section of A tutorial on statistical-learning for scientific data processing.

print(__doc__)

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn import datasets, svm

digits = datasets.load_digits()
X = digits.data
y = digits.target

svc = svm.SVC(kernel='linear')
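Continuing the snippet above, a sketch of how the exercise can score the SVM over a grid of C values; the grid and the cv=5 choice are illustrative assumptions:

C_s = np.logspace(-10, 0, 10)

scores = []
for C in C_s:
    svc.C = C
    # mean accuracy over the cross-validation folds
    scores.append(np.mean(cross_val_score(svc, X, y, cv=5)))
print(scores)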

FeatureHasher and DictVectorizer Comparison

Compares FeatureHasher and DictVectorizer by using both to vectorize text documents. The example demonstrates syntax and speed only; it doesn't actually do anything useful with the extracted vectors. See the example scripts {document_classification_20newsgroups,clustering}.py for actual learning on text documents. A discrepancy between the number of terms reported for DictVectorizer and for FeatureHasher is to be expected due to hash collisions.

# Author: Lars Buitinck
# License: BSD 3 clause
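A minimal sketch of the comparison; the token_freqs helper and the sample documents are illustrative assumptions, not the example's exact code:

from collections import defaultdict
import re

from sklearn.feature_extraction import DictVectorizer, FeatureHasher

def token_freqs(doc):
    # count raw token occurrences in a single document
    freq = defaultdict(int)
    for tok in re.findall(r"\w+", doc):
        freq[tok] += 1
    return freq

docs = ["the quick brown fox", "jumps over the lazy dog"]

vect = DictVectorizer()
X_dv = vect.fit_transform(token_freqs(d) for d in docs)
print("DictVectorizer terms:", len(vect.vocabulary_))

hasher = FeatureHasher(n_features=2 ** 10)
X_fh = hasher.transform(token_freqs(d) for d in docs)
print("FeatureHasher nonzero columns:", X_fh.nnz)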

sklearn.tree.export_graphviz()

sklearn.tree.export_graphviz() [source]

Export a decision tree in DOT format. This function generates a GraphViz representation of the decision tree, which is then written into out_file. Once exported, graphical renderings can be generated using, for example:

    $ dot -Tps tree.dot -o tree.ps   (PostScript format)
    $ dot -Tpng tree.dot -o tree.png (PNG format)

The sample counts that are shown are weighted with any sample_weights that might be present. Read more in the User Guide.

Parameters:

    decision_tree : decision tree classifier
        The decision tree to be exported to GraphViz.
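A minimal sketch of exporting a fitted tree to DOT; the iris data and the output file name are illustrative assumptions:

from sklearn import datasets, tree

iris = datasets.load_iris()
clf = tree.DecisionTreeClassifier().fit(iris.data, iris.target)

# write a DOT description of the fitted tree to tree.dot
tree.export_graphviz(clf, out_file="tree.dot",
                     feature_names=iris.feature_names)
# then render from a shell, e.g.: dot -Tpng tree.dot -o tree.png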

sklearn.feature_selection.mutual_info_classif()

sklearn.feature_selection.mutual_info_classif(X, y, discrete_features='auto', n_neighbors=3, copy=True, random_state=None) [source]

Estimate mutual information for a discrete target variable. Mutual information (MI) [R169] between two random variables is a non-negative value which measures the dependency between the variables. It is equal to zero if and only if two random variables are independent, and higher values mean higher dependency. The function relies on nonparametric methods based on entropy estimation from k-nearest neighbors distances.
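A minimal sketch on a synthetic classification problem; the make_classification parameters are illustrative assumptions:

from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=200, n_features=5,
                           n_informative=2, random_state=0)

# one non-negative MI estimate per feature; informative features score higher
mi = mutual_info_classif(X, y, n_neighbors=3, random_state=0)
print(mi)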

Lasso path using LARS

Computes the Lasso path along the regularization parameter using the LARS algorithm on the diabetes dataset. Each color represents a different feature of the coefficient vector, and this is displayed as a function of the regularization parameter.

Out:

Computing regularization path using the LARS ...
.

print(__doc__)

# Author: Fabian Pedregosa <fabian.pedregosa@inria.fr>
#         Alexandre Gramfort <alexandre.gramfort@inria.fr>
# License: BSD 3 clause

import numpy as np
import matplotlib.pyplot as plt
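A minimal sketch of the core computation, with the plotting details omitted:

from sklearn import datasets
from sklearn.linear_model import lars_path

diabetes = datasets.load_diabetes()
X, y = diabetes.data, diabetes.target

# method='lasso' computes the Lasso solution path via LARS
alphas, _, coefs = lars_path(X, y, method='lasso', verbose=True)
print(coefs.shape)  # (n_features, n_alphas): one row per feature's path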

random_projection.GaussianRandomProjection()

class sklearn.random_projection.GaussianRandomProjection(n_components='auto', eps=0.1, random_state=None) [source]

Reduce dimensionality through Gaussian random projection. The components of the random matrix are drawn from N(0, 1 / n_components). Read more in the User Guide.

Parameters:

    n_components : int or 'auto', optional (default = 'auto')
        Dimensionality of the target projection space. n_components can be automatically adjusted according to the number of samples in the dataset and the bound given by the Johnson-Lindenstrauss lemma. In that case the quality of the embedding is controlled by the eps parameter.
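A minimal sketch; the input shape is an illustrative assumption, chosen large enough that the 'auto' mode can satisfy the default eps=0.1 bound:

import numpy as np
from sklearn.random_projection import GaussianRandomProjection

rng = np.random.RandomState(42)
X = rng.rand(100, 10000)  # many features, so the JL target dimension fits

transformer = GaussianRandomProjection(random_state=42)
X_new = transformer.fit_transform(X)
print(X_new.shape)  # n_components picked from the Johnson-Lindenstrauss bound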

sklearn.datasets.fetch_lfw_pairs()

sklearn.datasets.fetch_lfw_pairs(subset='train', data_home=None, funneled=True, resize=0.5, color=False, slice_=(slice(70, 195, None), slice(78, 172, None)), download_if_missing=True) [source]

Loader for the Labeled Faces in the Wild (LFW) pairs dataset. This dataset is a collection of JPEG pictures of famous people collected on the internet; all details are available on the official website: http://vis-www.cs.umass.edu/lfw/

Each picture is centered on a single face. Each pixel of each channel (color in RGB) is encoded by a float in the range 0.0 - 1.0.
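A minimal sketch of loading the pairs data; note this downloads the archive on first use, so it is illustrative rather than instant to run:

from sklearn.datasets import fetch_lfw_pairs

lfw_pairs = fetch_lfw_pairs(subset='train')
print(lfw_pairs.pairs.shape)  # (n_pairs, 2, height, width)
print(lfw_pairs.target[:10])  # 1 for "same person" pairs, 0 otherwise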

exceptions.DataConversionWarning

class sklearn.exceptions.DataConversionWarning [source]

Warning used to notify implicit data conversions happening in the code. This warning occurs when some input data needs to be converted or interpreted in a way that may not match the user's expectations. For example, this warning may occur when the user

    - passes an integer array to a function which expects float input and will convert the input;
    - requests a non-copying operation, but a copy is required to meet the implementation's data-type expectations;
    - passes an input whose shape can be interpreted ambiguously.
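A minimal sketch of triggering and catching this warning; passing a column-vector y where a 1d array is expected is used here as an illustrative trigger:

import warnings
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.exceptions import DataConversionWarning

X = np.array([[0.0], [1.0], [2.0], [3.0]])
y = np.array([[0], [0], [1], [1]])  # shape (n_samples, 1), not (n_samples,)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    RandomForestClassifier(n_estimators=5).fit(X, y)

# fit converts y to 1d internally and emits a DataConversionWarning
print([w.category.__name__ for w in caught
       if issubclass(w.category, DataConversionWarning)])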