linear_model.HuberRegressor()

class sklearn.linear_model.HuberRegressor(epsilon=1.35, max_iter=100, alpha=0.0001, warm_start=False, fit_intercept=True, tol=1e-05) [source] Linear regression model that is robust to outliers. The Huber Regressor optimizes the squared loss for the samples where |(y - X'w) / sigma| < epsilon and the absolute loss for the samples where |(y - X'w) / sigma| > epsilon, where w and sigma are parameters to be optimized. The parameter sigma makes sure that if y is scaled up or down by a certain factor, one does not need to rescale epsilon to achieve the same robustness.
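
A minimal sketch of fitting this estimator; the synthetic data and outlier injection below are illustrative assumptions, not part of the reference entry:

    import numpy as np
    from sklearn.linear_model import HuberRegressor

    rng = np.random.RandomState(0)
    X = rng.uniform(0, 10, size=(100, 1))
    y = 2.0 * X.ravel() + rng.normal(scale=0.5, size=100)
    y[:5] += 30  # inject a few outliers

    # epsilon controls how many samples get the absolute (robust) loss;
    # smaller values make the fit more robust to outliers.
    huber = HuberRegressor(epsilon=1.35, alpha=0.0001).fit(X, y)
    print(huber.coef_, huber.intercept_)
    print(huber.outliers_.sum(), "samples treated as outliers")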

Plot the decision surface of a decision tree on the iris dataset

Plot the decision surface of a decision tree trained on pairs of features of the iris dataset. See decision tree for more information on the estimator. For each pair of iris features, the decision tree learns decision boundaries made of combinations of simple thresholding rules inferred from the training samples. The example begins:

    print(__doc__)

    import numpy as np
    import matplotlib.pyplot as plt

    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    # Parameters
    n_classes = 3
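
A condensed sketch of the rest of the plot for a single feature pair (the pair choice and grid step are assumptions for brevity; the full example loops over all pairs):

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.datasets import load_iris
    from sklearn.tree import DecisionTreeClassifier

    iris = load_iris()
    X = iris.data[:, [0, 2]]  # sepal length and petal length only
    y = iris.target

    clf = DecisionTreeClassifier().fit(X, y)

    # Evaluate the classifier on a grid covering the feature space.
    x_min, x_max = X[:, 0].min() - 1, X[:, 0].max() + 1
    y_min, y_max = X[:, 1].min() - 1, X[:, 1].max() + 1
    xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.02),
                         np.arange(y_min, y_max, 0.02))
    Z = clf.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)

    plt.contourf(xx, yy, Z, alpha=0.4)
    plt.scatter(X[:, 0], X[:, 1], c=y, edgecolor="k")
    plt.xlabel(iris.feature_names[0])
    plt.ylabel(iris.feature_names[2])
    plt.show()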

Agglomerative clustering with and without structure

This example shows the effect of imposing a connectivity graph to capture local structure in the data. The graph is simply the graph of 20 nearest neighbors. Two consequences of imposing a connectivity can be seen. First, clustering with a connectivity matrix is much faster. Second, when using a connectivity matrix, average and complete linkage are unstable and tend to create a few clusters that grow very quickly. Indeed, average and complete linkage fight this percolation behavior by considering all the distances between two clusters when merging them.
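
A minimal sketch of the comparison (the toy data, cluster count, and linkage list are assumptions for illustration):

    import numpy as np
    from sklearn.cluster import AgglomerativeClustering
    from sklearn.neighbors import kneighbors_graph

    rng = np.random.RandomState(0)
    t = np.linspace(0, 10, 500)
    X = np.column_stack([t, np.sin(t) + 0.1 * rng.randn(500)])

    # Connectivity graph of the 20 nearest neighbors, as in the example.
    connectivity = kneighbors_graph(X, n_neighbors=20, include_self=False)

    for linkage in ("average", "complete"):
        unstructured = AgglomerativeClustering(n_clusters=4, linkage=linkage).fit(X)
        structured = AgglomerativeClustering(n_clusters=4, linkage=linkage,
                                             connectivity=connectivity).fit(X)
        print(linkage, "without:", np.bincount(unstructured.labels_),
              "with:", np.bincount(structured.labels_))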

OOB Errors for Random Forests

The RandomForestClassifier is trained using bootstrap aggregation, where each new tree is fit from a bootstrap sample of the training observations z_i = (x_i, y_i). The out-of-bag (OOB) error is the average error for each z_i calculated using predictions from the trees that do not contain z_i in their respective bootstrap sample. This allows the RandomForestClassifier to be fit and validated whilst being trained [1]. The example below demonstrates how the OOB error can be measured at the addition of each new tree during training.
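
A minimal sketch of reading the OOB error from fitted forests (the dataset and the n_estimators values are illustrative assumptions):

    from sklearn.datasets import make_classification
    from sklearn.ensemble import RandomForestClassifier

    X, y = make_classification(n_samples=500, n_features=25, random_state=42)

    # oob_score=True scores each sample only with the trees whose bootstrap
    # sample did not contain it.
    for n_estimators in (15, 50, 100):
        clf = RandomForestClassifier(n_estimators=n_estimators, oob_score=True,
                                     random_state=42)
        clf.fit(X, y)
        print(n_estimators, "trees -> OOB error:", 1 - clf.oob_score_)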

Comparison of F-test and mutual information

This example illustrates the differences between univariate F-test statistics and mutual information. We consider 3 features x_1, x_2, x_3 distributed uniformly over [0, 1]; the target depends on them as follows: y = x_1 + sin(6 * pi * x_2) + 0.1 * N(0, 1), that is, the third feature is completely irrelevant. The code below plots the dependency of y against individual x_i and the normalized values of univariate F-test statistics and mutual information. As the F-test captures only linear dependency, it rates x_1 as the most discriminative feature.
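
A condensed sketch of the computation (plotting omitted; the normalization by the maximum follows the example's convention):

    import numpy as np
    from sklearn.feature_selection import f_regression, mutual_info_regression

    rng = np.random.RandomState(0)
    X = rng.rand(1000, 3)
    y = X[:, 0] + np.sin(6 * np.pi * X[:, 1]) + 0.1 * rng.randn(1000)

    f_test, _ = f_regression(X, y)
    f_test /= np.max(f_test)          # normalize so the scores are comparable

    mi = mutual_info_regression(X, y)
    mi /= np.max(mi)

    print("F-test:            ", f_test)  # largest for x_1 (linear dependency)
    print("Mutual information:", mi)      # largest for x_2 (nonlinear dependency)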

sklearn.grid_search.fit_grid_point()

Warning: DEPRECATED

sklearn.grid_search.fit_grid_point(X, y, estimator, parameters, train, test, scorer, verbose, error_score='raise', **fit_params) [source] Run fit on one set of parameters. Deprecated since version 0.18: This module will be removed in 0.20. Use sklearn.model_selection.fit_grid_point instead.

Parameters:
X : array-like, sparse matrix or list. Input data.
y : array-like or None. Targets for input data.
estimator : estimator object. An object of that type is instantiated for each grid point.
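
A minimal sketch of evaluating a single grid point through the model_selection replacement named above, for scikit-learn versions that still ship this helper; the split, scorer, and parameter setting are illustrative assumptions:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.metrics import check_scoring
    from sklearn.model_selection import fit_grid_point
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)
    estimator = SVC()
    scorer = check_scoring(estimator, scoring="accuracy")

    # One train/test split and one parameter setting (a single "grid point").
    indices = np.random.RandomState(0).permutation(len(X))
    train, test = indices[:100], indices[100:]
    score, parameters, n_test = fit_grid_point(
        X, y, estimator, {"C": 1.0}, train, test, scorer, verbose=0)
    print(score, parameters, n_test)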

Plot classification probability

Plot the classification probability for different classifiers. We use a 3 class dataset, and we classify it with a Support Vector classifier, L1 and L2 penalized logistic regression with either a One-Vs-Rest or multinomial setting, and Gaussian process classification. The logistic regression is not a multiclass classifier out of the box. As a result, it can identify only the first class.

Out:
classif_rate for GPC : 82.666667
classif_rate for L2 logistic (OvR) : 76.666667
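
A minimal sketch of how such classification rates can be computed from predict_proba for two of the classifiers (hyperparameters are illustrative assumptions, and the numbers will not match the output above):

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.gaussian_process import GaussianProcessClassifier
    from sklearn.gaussian_process.kernels import RBF
    from sklearn.linear_model import LogisticRegression

    X, y = load_iris(return_X_y=True)
    X = X[:, :2]  # first two features, as in the example's 2D plots

    classifiers = {
        "L2 logistic": LogisticRegression(C=1.0),
        "GPC": GaussianProcessClassifier(kernel=1.0 * RBF(1.0)),
    }

    for name, clf in classifiers.items():
        clf.fit(X, y)
        proba = clf.predict_proba(X)                # per-class probabilities
        classif_rate = np.mean(proba.argmax(axis=1) == y) * 100
        print("classif_rate for %s : %f" % (name, classif_rate))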

Plot the decision surfaces of ensembles of trees on the iris dataset

Plot the decision surfaces of forests of randomized trees trained on pairs of features of the iris dataset. This plot compares the decision surfaces learned by a decision tree classifier (first column), by a random forest classifier (second column), by an extra-trees classifier (third column) and by an AdaBoost classifier (fourth column). In the first row, the classifiers are built using the sepal width and the sepal length features only, on the second row using the petal length and sepal length only, and on the third row using the petal width and the petal length only.
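
A minimal sketch of the four estimators on a single feature pair, reporting training accuracy instead of reproducing the plots (the hyperparameters are illustrative assumptions):

    from sklearn.datasets import load_iris
    from sklearn.ensemble import (AdaBoostClassifier, ExtraTreesClassifier,
                                  RandomForestClassifier)
    from sklearn.tree import DecisionTreeClassifier

    iris = load_iris()
    X = iris.data[:, [1, 0]]  # sepal width and sepal length, as in the first row
    y = iris.target

    models = [
        DecisionTreeClassifier(),
        RandomForestClassifier(n_estimators=30),
        ExtraTreesClassifier(n_estimators=30),
        AdaBoostClassifier(DecisionTreeClassifier(max_depth=3), n_estimators=30),
    ]
    for model in models:
        score = model.fit(X, y).score(X, y)
        print(type(model).__name__, "training accuracy:", round(score, 3))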

preprocessing.KernelCenterer

class sklearn.preprocessing.KernelCenterer [source] Center a kernel matrix. Let K(x, z) be a kernel defined by phi(x)^T phi(z), where phi is a function mapping x to a Hilbert space. KernelCenterer centers (i.e., normalizes to have zero mean) the data without explicitly computing phi(x). It is equivalent to centering phi(x) with sklearn.preprocessing.StandardScaler(with_std=False). Read more in the User Guide.

Methods:
fit(K[, y]) : Fit KernelCenterer.
fit_transform(X[, y]) : Fit to data, then transform it.
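
A minimal sketch of the stated equivalence for a linear kernel, where phi(x) = x (the toy data is an illustrative assumption):

    import numpy as np
    from sklearn.metrics.pairwise import linear_kernel
    from sklearn.preprocessing import KernelCenterer, StandardScaler

    rng = np.random.RandomState(0)
    X = rng.rand(5, 3)

    # Center the kernel matrix directly ...
    K = linear_kernel(X)                      # K[i, j] = x_i . x_j, so phi(x) = x
    K_centered = KernelCenterer().fit_transform(K)

    # ... and compare with computing the kernel on mean-centered features.
    X_centered = StandardScaler(with_std=False).fit_transform(X)
    K_from_centered = linear_kernel(X_centered)

    print(np.allclose(K_centered, K_from_centered))  # True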

preprocessing.LabelBinarizer()

class sklearn.preprocessing.LabelBinarizer(neg_label=0, pos_label=1, sparse_output=False) [source] Binarize labels in a one-vs-all fashion. Several regression and binary classification algorithms are available in scikit-learn. A simple way to extend these algorithms to the multi-class classification case is to use the so-called one-vs-all scheme. At learning time, this simply consists of learning one regressor or binary classifier per class. In doing so, one needs to convert multi-class labels to binary labels (belong or does not belong to the class); LabelBinarizer makes this process easy with the transform method.
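
A minimal sketch of the transform (the label values are chosen for illustration):

    from sklearn.preprocessing import LabelBinarizer

    lb = LabelBinarizer()
    Y = lb.fit_transform(["spam", "ham", "eggs", "spam"])
    print(lb.classes_)              # ['eggs' 'ham' 'spam']
    print(Y)                        # one indicator column per class:
    # [[0 0 1]
    #  [0 1 0]
    #  [1 0 0]
    #  [0 0 1]]
    print(lb.inverse_transform(Y))  # recovers the original labels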