The plot shows how logistic regression, applied to this synthetic dataset, classifies values as either 0 or 1 (that is, class one or class two) using the logistic curve.

    print(__doc__)
    # Code source: Gael Varoquaux
    # License: BSD 3 clause
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import linear_model
    # this is our test set, it's just a straight line with some
    # Gaussian noise
    xmin, xmax = -5, 5
    n_samples = 100
    np.random.seed(0)
    X = np.random.normal(size=n_samples)
    y =
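To make the role of the logistic curve concrete, here is a minimal sketch of how a fitted one-dimensional model turns a value into a 0/1 label. The names `logistic` and `classify`, the 0.5 threshold, and the coefficient and intercept values are assumptions chosen for illustration; they are not the values fitted in the example above.

```python
import math


def logistic(z):
    """Logistic (sigmoid) curve: maps any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def classify(x, coef, intercept, threshold=0.5):
    """Return 1 if the modelled probability reaches the threshold, else 0."""
    return 1 if logistic(coef * x + intercept) >= threshold else 0


# Hypothetical parameters for illustration only, not fitted values.
coef, intercept = 2.0, -1.0
labels = [classify(x, coef, intercept) for x in (-3.0, 0.0, 0.5, 4.0)]
print(labels)
```

With these assumed parameters the decision boundary sits at x = 0.5, where the curve crosses 0.5; values to its left are labelled 0 and values at or to its right are labelled 1.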

Lasso and elastic net (L1 and L2 penalisation) implemented using coordinate descent. The coefficients can be forced to be positive.

Out:
    Computing regularization path using the lasso...
    Computing regularization path using the positive lasso...
    Computing regularization path using the elastic net...
    Computing regularization path using the positive elastic net...

    print(__doc__)
    # Author: Alexandre Gramfort <alexandre.gramfort@inria.fr>
    # License: BSD 3 clause
    from itertools
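As a rough illustration of what coordinate descent does for the lasso, the sketch below cycles over the coefficients and updates each one by soft-thresholding. It assumes the objective (1 / (2 n)) * ||y - Xw||^2 + alpha * ||w||_1, leaves out the elastic net's L2 term, and uses hypothetical names (`soft_threshold`, `lasso_cd`) and toy data; it is not the library's implementation.

```python
def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lasso_cd(X, y, alpha, positive=False, n_iter=100):
    """Cyclic coordinate descent for the lasso on plain Python lists."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Correlation of feature j with the residual that ignores feature j.
            rho = sum(
                X[i][j] * (y[i] - sum(X[i][k] * w[k] for k in range(p) if k != j))
                for i in range(n)
            ) / n
            z = sum(X[i][j] ** 2 for i in range(n)) / n
            if z == 0.0:
                w[j] = 0.0
            elif positive:
                # Positive constraint: negative solutions are clipped to zero.
                w[j] = max(rho - alpha, 0.0) / z
            else:
                w[j] = soft_threshold(rho, alpha) / z
    return w


# Toy data with orthogonal columns (assumed for illustration).
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
y = [2.0, -1.0, 2.0, -1.0]
w_free = lasso_cd(X, y, alpha=0.25)
w_pos = lasso_cd(X, y, alpha=0.25, positive=True)
print(w_free, w_pos)
```

On this toy problem the unconstrained fit keeps a negative second coefficient, while the positive variant sets it to zero, which is the effect of forcing the coefficients to be positive.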

sklearn.metrics.median_absolute_error()

sklearn.metrics.median_absolute_error(y_true, y_pred)

Median absolute error regression loss. Read more in the User Guide.

Parameters:
    y_true : array-like of shape = (n_samples)
        Ground truth (correct) target values.
    y_pred : array-like of shape = (n_samples)
        Estimated target values.

Returns:
    loss : float
        A non-negative floating point value (the best value is 0.0).

Examples:
    >>> from sklearn.metrics import median_absolute_error
    >>> y_true = [3, -0.5, 2, 7]
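The quantity itself is simple to state: take the absolute difference between each true and predicted value, then take the median. The sketch below computes it with the standard library only. The function name `median_abs_error` and the `y_pred` values are assumptions for illustration; `y_true` matches the example above.

```python
from statistics import median


def median_abs_error(y_true, y_pred):
    """Median of the absolute differences between targets and predictions."""
    return median(abs(t - p) for t, p in zip(y_true, y_pred))


y_true = [3, -0.5, 2, 7]
y_pred = [2.5, 0.0, 2, 8]  # hypothetical predictions, not from the source
loss = median_abs_error(y_true, y_pred)
print(loss)
```

Because the median ignores the size of the largest errors, a single badly mispredicted sample has little effect on this loss.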

sklearn.metrics.label_ranking_loss()

sklearn.metrics.label_ranking_loss(y_true, y_score, sample_weight=None)

Compute the ranking loss measure.

Compute the average number of label pairs that are incorrectly ordered given y_score, weighted by the size of the label set and the number of labels not in the label set. This is similar to the error set size, but weighted by the number of relevant and irrelevant labels. The best performance is achieved with a ranking loss of zero. Read more in the User Guide.

New in version 0.17: A
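The description above can be sketched directly: for each sample, count the (relevant, irrelevant) label pairs in which the irrelevant label scores at least as high as the relevant one, divide by the number of such pairs, and average over samples. Two choices here are assumptions rather than statements about the library: ties are counted as errors, and samples with no relevant or no irrelevant labels contribute zero. The function name `ranking_loss` and the data are hypothetical, and `sample_weight` is not handled.

```python
def ranking_loss(y_true, y_score):
    """Average fraction of incorrectly ordered (relevant, irrelevant) pairs."""
    total = 0.0
    for labels, scores in zip(y_true, y_score):
        relevant = [s for l, s in zip(labels, scores) if l]
        irrelevant = [s for l, s in zip(labels, scores) if not l]
        n_pairs = len(relevant) * len(irrelevant)
        if n_pairs == 0:
            continue  # assumption: such samples add nothing to the loss
        # A pair is wrong when the irrelevant label is ranked at least as high.
        wrong = sum(1 for r in relevant for i in irrelevant if r <= i)
        total += wrong / n_pairs
    return total / len(y_true)


# Hypothetical binary indicator labels and scores.
y_true = [[1, 0, 0], [0, 0, 1]]
y_score = [[0.75, 0.5, 1.0], [1.0, 0.2, 0.1]]
loss = ranking_loss(y_true, y_score)
print(loss)
```

In the first sample one of the two pairs is mis-ordered, and in the second both are, so the average is (0.5 + 1.0) / 2.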

SVM Exercise

A tutorial exercise for using different SVM kernels. This exercise is used in the "Using kernels" part of the "Supervised learning: predicting an output variable from high-dimensional observations" section of "A tutorial on statistical-learning for scientific data processing".

    print(__doc__)
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn import datasets, svm
    iris = datasets.load_iris()
    X = iris.data
    y = iris.target
    X = X[y != 0, :2]
    y = y[y != 0]
    n_sample = len(X)

sklearn.datasets.make_moons()

sklearn.datasets.make_moons(n_samples=100, shuffle=True, noise=None, random_state=None)

Make two interleaving half circles. A simple toy dataset to visualize clustering and classification algorithms. Read more in the User Guide.

Parameters:
    n_samples : int, optional (default=100)
        The total number of points generated.
    shuffle : bool, optional (default=True)
        Whether to shuffle the samples.
    noise : double or None (default=None)
        Standard deviation of Gaussian noise added to the da
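To show what "two interleaving half circles" means geometrically, the following sketch builds such a dataset with the standard library. It is an illustration under stated assumptions, not the library's code: the function name `two_moons`, the `seed` parameter, and the exact offsets of the second half circle are choices made here, and shuffling is left out.

```python
import math
import random


def two_moons(n_samples=100, noise=None, seed=0):
    """Return (points, labels) for two interleaving half circles."""
    rng = random.Random(seed)
    n_out = n_samples // 2
    n_in = n_samples - n_out
    points, labels = [], []
    # Upper half circle of radius 1 centred at the origin (label 0).
    for i in range(n_out):
        t = math.pi * i / max(n_out - 1, 1)
        points.append([math.cos(t), math.sin(t)])
        labels.append(0)
    # Lower half circle, shifted so the two arcs interleave (label 1).
    for i in range(n_in):
        t = math.pi * i / max(n_in - 1, 1)
        points.append([1.0 - math.cos(t), 0.5 - math.sin(t)])
        labels.append(1)
    # Optional Gaussian noise with the given standard deviation.
    if noise is not None:
        for p in points:
            p[0] += rng.gauss(0.0, noise)
            p[1] += rng.gauss(0.0, noise)
    return points, labels


points, labels = two_moons(n_samples=100)
print(len(points), labels.count(0), labels.count(1))
```

Passing a small `noise` value blurs the two arcs, which is what makes the dataset useful for comparing how classifiers handle a non-linear boundary.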

Prediction Intervals for Gradient Boosting Regression

This example shows how quantile regression can be used to create prediction intervals.

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.ensemble import GradientBoostingRegressor
    np.random.seed(1)
    def f(x):
        """The function to predict."""
        return x * np.sin(x)
    #----------------------------------------------------------------------
    # First the noiseless case
    X = np.atleast_2d(np.random.uniform(0, 10.0, size=100)).T
    X = X.astype(np.float32)
    # Observations
    y = f(X).
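The idea behind a quantile-based prediction interval can be sketched without any model at all. Quantile regression minimises the pinball (quantile) loss, and fitting one predictor for a low quantile and one for a high quantile gives the two ends of the interval. In the sketch below a constant prediction stands in for the regressor; the names `pinball_loss` and `best_constant`, the 0.05 and 0.95 levels, and the data are assumptions for illustration and do not reproduce the gradient boosting example.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball loss: under- and over-predictions are weighted unequally."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        diff = t - p
        total += alpha * diff if diff >= 0 else (alpha - 1.0) * diff
    return total / len(y_true)


def best_constant(y, alpha):
    """Observed value that minimises the pinball loss as a constant prediction."""
    candidates = sorted(set(y))
    return min(candidates, key=lambda c: pinball_loss(y, [c] * len(y), alpha))


# Hypothetical targets: the integers 0..100.
y = list(range(101))
lower = best_constant(y, 0.05)  # stands in for a 5% quantile model
upper = best_constant(y, 0.95)  # stands in for a 95% quantile model
coverage = sum(lower <= t <= upper for t in y) / len(y)
print(lower, upper, coverage)
```

The two minimisers land near the 5th and 95th percentiles of the data, so the band between them covers roughly 90% of the observations; a regressor trained with the same loss would produce such a band as a function of the input.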

1.8. Cross decomposition

The cross decomposition module contains two main families of algorithms: the partial least squares (PLS) and the canonical correlation analysis (CCA). These families of algorithms are useful to find linear relations between two multivariate datasets: the X and Y arguments of the fit method are 2D arrays. Cross decomposition algorithms find the fundamental relations between two matrices (X and Y). They are latent variable approaches to modeling the covariance structures in these two spaces.
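One way to picture "finding linear relations between two matrices" is to look for a direction in X and a direction in Y whose projections have maximal covariance. A minimal sketch of that first step follows, assuming the weights are taken as the dominant singular vectors of the cross-covariance matrix of the centred data, found here by power iteration. The function names and toy data are hypothetical, and this is not the module's implementation.

```python
import math


def center(M):
    """Subtract the column means from every row."""
    n = len(M)
    means = [sum(col) / n for col in zip(*M)]
    return [[v - m for v, m in zip(row, means)] for row in M]


def normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def first_pls_weights(X, Y, n_iter=200):
    """Weight vectors (u for X, v for Y) maximising cov(X u, Y v)."""
    Xc, Yc = center(X), center(Y)
    p, q = len(Xc[0]), len(Yc[0])
    # Cross-covariance up to a constant factor: C = Xc^T Yc.
    C = [[sum(xr[j] * yr[k] for xr, yr in zip(Xc, Yc)) for k in range(q)]
         for j in range(p)]
    v = normalize([1.0] * q)
    for _ in range(n_iter):
        u = normalize([sum(C[j][k] * v[k] for k in range(q)) for j in range(p)])
        v = normalize([sum(C[j][k] * u[j] for j in range(p)) for k in range(q)])
    return u, v


# Toy data: Y depends only on the first column of X.
X = [[-1.0, -1.0], [1.0, -1.0], [-1.0, 1.0], [1.0, 1.0]]
Y = [[2.0 * row[0], -row[0]] for row in X]
u, v = first_pls_weights(X, Y)
print(u, v)
```

Here the X weights put all their mass on the first column, the only one related to Y, which is the kind of latent direction these methods are designed to recover.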

sklearn.datasets.load_mlcomp()

sklearn.datasets.load_mlcomp(name_or_id, set_='raw', mlcomp_root=None, **kwargs)

Load a dataset as downloaded from http://mlcomp.org

Parameters:
    name_or_id : the integer id or the string name metadata of the MLComp dataset to load
    set_ : select the portion to load: 'train', 'test' or 'raw'
    mlcomp_root : the filesystem path to the root folder where MLComp datasets are stored. If mlcomp_root is None, the MLCOMP_DATASETS_HOME environment variable is looked up instead.
    **kwargs :

sklearn.datasets.load_sample_image()

sklearn.datasets.load_sample_image(image_name)

Load the numpy array of a single sample image.

Parameters:
    image_name : {`china.jpg`, `flower.jpg`}
        The name of the sample image loaded.

Returns:
    img : 3D array
        The image as a numpy array: height x width x color.

Examples:
    >>> from sklearn.datasets import load_sample_image
    >>> china = load_sample_image('china.jpg')
    >>> china.dtype
    dtype('uint8')
    >>> china.shape