base.RegressorMixin

class sklearn.base.RegressorMixin [source] Mixin class for all regression estimators in scikit-learn.
Methods: score(X, y[, sample_weight]) : Returns the coefficient of determination R^2 of the prediction.
score(X, y, sample_weight=None) [source] Returns the coefficient of determination R^2 of the prediction. The coefficient R^2 is defined as (1 - u/v), where u is the residual sum of squares ((y_true - y_pred) ** 2).sum() and v is the total sum of squares ((y_true - y_true.mean()) ** 2).sum(). The best possible score is 1.0, and it can be negative (because the model can be arbitrarily worse).
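A minimal sketch of how the inherited score method is typically used; the data is made up for illustration, and LinearRegression stands in for any regressor that inherits this mixin.

import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: y is a noisy linear function of a single feature
rng = np.random.RandomState(0)
X = rng.rand(100, 1)
y = 3.0 * X[:, 0] + 0.1 * rng.randn(100)

reg = LinearRegression().fit(X, y)
# score() is provided by RegressorMixin and returns R^2 on the given data
print(reg.score(X, y))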

sklearn.metrics.pairwise.polynomial_kernel()

sklearn.metrics.pairwise.polynomial_kernel(X, Y=None, degree=3, gamma=None, coef0=1) [source] Compute the polynomial kernel between X and Y: K(X, Y) = (gamma <X, Y> + coef0)^degree. Read more in the User Guide.
Parameters: X : ndarray of shape (n_samples_1, n_features); Y : ndarray of shape (n_samples_2, n_features); degree : int, default 3; gamma : float, default None (if None, defaults to 1.0 / n_features); coef0 : int, default 1.
Returns: Gram matrix : array of shape (n_samples_1, n_samples_2)
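A small sketch of calling polynomial_kernel directly; the arrays are arbitrary, and the explicit computation is shown only to illustrate the formula above.

import numpy as np
from sklearn.metrics.pairwise import polynomial_kernel

X = np.array([[0., 1.], [1., 2.]])
Y = np.array([[2., 0.], [1., 1.], [0., 3.]])

# Gram matrix of shape (n_samples_1, n_samples_2) = (2, 3)
K = polynomial_kernel(X, Y, degree=3, gamma=0.5, coef0=1)

# Same result computed from the formula (gamma * <X, Y> + coef0) ** degree
K_manual = (0.5 * X.dot(Y.T) + 1) ** 3
print(np.allclose(K, K_manual))  # True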

Plot randomly generated classification dataset

Plot several randomly generated 2D classification datasets. This example illustrates the datasets.make_classification, datasets.make_blobs and datasets.make_gaussian_quantiles functions. For make_classification, three binary and two multi-class classification datasets are generated, with different numbers of informative features and clusters per class.
print(__doc__)
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.datasets import make_blobs
from sklearn.datasets import make_gaussian_quantiles
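A minimal, self-contained sketch along the lines of this example, generating a single make_classification dataset and plotting it; the parameter values are chosen here only for illustration.

import matplotlib.pyplot as plt
from sklearn.datasets import make_classification

# Two informative features, one cluster per class
X, y = make_classification(n_samples=100, n_features=2, n_redundant=0,
                           n_informative=2, n_clusters_per_class=1,
                           random_state=1)
plt.scatter(X[:, 0], X[:, 1], c=y, marker='o', s=25, edgecolor='k')
plt.title("Two informative features, one cluster per class")
plt.show()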

SVM-Anova

This example shows how to perform univariate feature selection before running an SVC (support vector classifier) to improve the classification scores.
print(__doc__)
import numpy as np
import matplotlib.pyplot as plt
from sklearn import svm, datasets, feature_selection
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

# Import some data to play with
digits = datasets.load_digits()
y = digits.target
# Throw away data, to be in the curse of dimension setting
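A condensed sketch of the idea behind this example: univariate (ANOVA) feature selection chained with an SVC in a Pipeline and evaluated with cross-validation. The percentile and SVC settings here are arbitrary, not the ones from the full example.

import numpy as np
from sklearn import svm, datasets, feature_selection
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import Pipeline

digits = datasets.load_digits()
X, y = digits.data, digits.target

# ANOVA F-test keeps only the top features before the classifier sees them
anova = feature_selection.SelectPercentile(feature_selection.f_classif,
                                           percentile=10)
clf = Pipeline([('anova', anova), ('svc', svm.SVC(C=1.0))])

scores = cross_val_score(clf, X, y, cv=5)
print(scores.mean())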

linear_model.LassoLars()

class sklearn.linear_model.LassoLars(alpha=1.0, fit_intercept=True, verbose=False, normalize=True, precompute='auto', max_iter=500, eps=2.2204460492503131e-16, copy_X=True, fit_path=True, positive=False) [source] Lasso model fit with Least Angle Regression, a.k.a. Lars. It is a linear model trained with an L1 prior as regularizer. The optimization objective for Lasso is: (1 / (2 * n_samples)) * ||y - Xw||^2_2 + alpha * ||w||_1. Read more in the User Guide.
Parameters: alpha : float Constant that multiplies the penalty term. Defaults to 1.0.
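A short usage sketch on a small hand-made dataset; alpha=0.01 is an arbitrary choice for illustration.

import numpy as np
from sklearn.linear_model import LassoLars

X = np.array([[-1., 1.], [0., 0.], [1., 1.]])
y = np.array([-1.1111, 0., -1.1111])

# Fit the Lasso solution with the LARS algorithm; coef_ holds the weights
reg = LassoLars(alpha=0.01)
reg.fit(X, y)
print(reg.coef_)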

preprocessing.StandardScaler()

class sklearn.preprocessing.StandardScaler(copy=True, with_mean=True, with_std=True) [source] Standardize features by removing the mean and scaling to unit variance. Centering and scaling happen independently on each feature by computing the relevant statistics on the samples in the training set. Mean and standard deviation are then stored to be used on later data using the transform method. Standardization of a dataset is a common requirement for many machine learning estimators: they might behave badly if the individual features do not more or less look like standard normally distributed data (e.g. Gaussian with 0 mean and unit variance).
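A minimal sketch of the fit/transform workflow with made-up data: the per-feature statistics are learned on the training set and reused on later data.

import numpy as np
from sklearn.preprocessing import StandardScaler

X_train = np.array([[1., -1.,  2.],
                    [2.,  0.,  0.],
                    [0.,  1., -1.]])
scaler = StandardScaler()

# Learn per-feature mean and standard deviation on the training data
X_train_scaled = scaler.fit_transform(X_train)

# Apply the same stored statistics to new data
X_new = np.array([[-1., 1., 0.]])
print(scaler.transform(X_new))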

sklearn.metrics.recall_score()

sklearn.metrics.recall_score(y_true, y_pred, labels=None, pos_label=1, average='binary', sample_weight=None) [source] Compute the recall. The recall is the ratio tp / (tp + fn) where tp is the number of true positives and fn the number of false negatives. The recall is intuitively the ability of the classifier to find all the positive samples. The best value is 1 and the worst value is 0. Read more in the User Guide.
Parameters: y_true : 1d array-like, or label indicator array / sparse matrix
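A small sketch with made-up multiclass labels; the averaging choices shown are just two of the documented options.

from sklearn.metrics import recall_score

y_true = [0, 1, 2, 0, 1, 2]
y_pred = [0, 2, 1, 0, 0, 1]

# Unweighted mean of the per-class recalls
print(recall_score(y_true, y_pred, average='macro'))  # ~0.33
# Global recall over all true positives and false negatives
print(recall_score(y_true, y_pred, average='micro'))  # ~0.33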

sklearn.datasets.make_classification()

sklearn.datasets.make_classification(n_samples=100, n_features=20, n_informative=2, n_redundant=2, n_repeated=0, n_classes=2, n_clusters_per_class=2, weights=None, flip_y=0.01, class_sep=1.0, hypercube=True, shift=0.0, scale=1.0, shuffle=True, random_state=None) [source] Generate a random n-class classification problem. This initially creates clusters of points normally distributed (std=1) about vertices of a 2 * class_sep-sided hypercube, and assigns an equal number of clusters to each class. It then introduces interdependence between these features and adds various types of further noise to the data.
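A brief sketch of generating a dataset with this function; the parameter values are arbitrary and only meant to show the shape of the output.

from sklearn.datasets import make_classification

# 100 samples, 5 features of which 3 are informative, 2 classes
X, y = make_classification(n_samples=100, n_features=5, n_informative=3,
                           n_redundant=0, n_classes=2, random_state=42)
print(X.shape)  # (100, 5)
print(y[:10])   # class labels 0/1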

Multi-output Decision Tree Regression

An example to illustrate multi-output regression with a decision tree. The decision tree is used to predict simultaneously the noisy x and y observations of a circle given a single underlying feature. As a result, it learns local linear regressions approximating the circle. We can see that if the maximum depth of the tree (controlled by the max_depth parameter) is set too high, the decision tree learns overly fine details of the training data and learns from the noise, i.e. it overfits.
print(__doc__)
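A compressed sketch of the idea in this example (not the full plotting script): a single input feature predicts two targets at once with DecisionTreeRegressor.

import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Single input feature; two noisy targets tracing a circle
rng = np.random.RandomState(1)
X = np.sort(200 * rng.rand(100, 1) - 100, axis=0)
y = np.array([np.pi * np.sin(X).ravel(), np.pi * np.cos(X).ravel()]).T
y[::5, :] += 0.5 - rng.rand(20, 2)  # add noise to every 5th sample

# A multi-output regressor: y has shape (n_samples, 2)
regr = DecisionTreeRegressor(max_depth=5)
regr.fit(X, y)
print(regr.predict([[0.0]]))  # predicts both outputs for one sample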

sklearn.datasets.load_sample_image()

sklearn.datasets.load_sample_image(image_name) [source] Load the numpy array of a single sample image.
Parameters: image_name : {'china.jpg', 'flower.jpg'} The name of the sample image to load.
Returns: img : 3D array The image as a numpy array: height x width x color.
Examples
>>> from sklearn.datasets import load_sample_image
>>> china = load_sample_image('china.jpg')
>>> china.dtype
dtype('uint8')
>>> china.shape
(427, 640, 3)