Joint feature selection with multi-task Lasso

The multi-task lasso makes it possible to fit multiple regression problems jointly, enforcing the selected features to be the same across tasks. This example simulates sequential measurements: each task is a time instant, and the relevant features vary in amplitude over time while remaining the same. The multi-task lasso imposes that features selected at one time point are selected for all time points. This makes feature selection by the Lasso more stable.
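A minimal sketch of the idea, assuming toy data with a shared sparsity pattern across tasks (the shapes, coefficients, and alpha value are illustrative, not the example's exact setup):

import numpy as np
from sklearn.linear_model import MultiTaskLasso, Lasso

rng = np.random.RandomState(42)

# Toy data: 100 samples, 30 features, 40 tasks; only the first 5
# features are relevant, and they are the same for every task.
n_samples, n_features, n_tasks = 100, 30, 40
coef = np.zeros((n_tasks, n_features))
coef[:, :5] = rng.randn(n_tasks, 5)

X = rng.randn(n_samples, n_features)
Y = X @ coef.T + rng.randn(n_samples, n_tasks) * 0.5

# MultiTaskLasso selects the same features for all tasks, whereas
# independent Lasso fits may disagree from one task to the next.
mt_coef = MultiTaskLasso(alpha=0.5).fit(X, Y).coef_
lasso_coef = np.array([Lasso(alpha=0.5).fit(X, Y[:, k]).coef_
                       for k in range(n_tasks)])

print("nonzero per task (multi-task):  ", (mt_coef != 0).sum(axis=1)[:5])
print("nonzero per task (independent): ", (lasso_coef != 0).sum(axis=1)[:5])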

sklearn.datasets.make_sparse_spd_matrix()

sklearn.datasets.make_sparse_spd_matrix(dim=1, alpha=0.95, norm_diag=False, smallest_coef=0.1, largest_coef=0.9, random_state=None) [source]

Generate a sparse symmetric positive definite matrix. Read more in the User Guide.

Parameters:
dim : integer, optional (default=1)
    The size of the random matrix to generate.
alpha : float between 0 and 1, optional (default=0.95)
    The probability that a coefficient is zero (see notes). Larger values enforce more sparsity.
random_state : int, RandomState instance or None, optional (default=None)
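A short usage sketch checking the advertised properties of the output (the size and alpha value are illustrative):

import numpy as np
from sklearn.datasets import make_sparse_spd_matrix

# Generate a 5x5 sparse symmetric positive definite matrix.
A = make_sparse_spd_matrix(dim=5, alpha=0.9, random_state=0)

print(A.shape)                             # (5, 5)
print(np.allclose(A, A.T))                 # True: symmetric
print(np.all(np.linalg.eigvalsh(A) > 0))   # True: positive definite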

sklearn.datasets.make_circles()

sklearn.datasets.make_circles(n_samples=100, shuffle=True, noise=None, random_state=None, factor=0.8) [source]

Make a large circle containing a smaller circle in 2d. A simple toy dataset to visualize clustering and classification algorithms. Read more in the User Guide.

Parameters:
n_samples : int, optional (default=100)
    The total number of points generated.
shuffle : bool, optional (default=True)
    Whether to shuffle the samples.
noise : double or None (default=None)
    Standard deviation of Gaussian noise added to the data.
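A minimal sketch of generating and inspecting the dataset (the noise and factor values are illustrative):

from sklearn.datasets import make_circles

# Two concentric circles; factor controls the inner/outer radius ratio.
X, y = make_circles(n_samples=200, noise=0.05, factor=0.5, random_state=0)

print(X.shape, y.shape)   # (200, 2) (200,)
print(set(y))             # {0, 1}: outer circle vs. inner circle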

sklearn.metrics.jaccard_similarity_score()

sklearn.metrics.jaccard_similarity_score(y_true, y_pred, normalize=True, sample_weight=None) [source]

Jaccard similarity coefficient score. The Jaccard index [1], or Jaccard similarity coefficient, defined as the size of the intersection divided by the size of the union of two label sets, is used to compare the set of predicted labels for a sample to the corresponding set of labels in y_true. Read more in the User Guide.

Parameters:
y_true : 1d array-like, or label indicator array / sparse matrix
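A brief usage sketch with label indicator arrays; the values follow from the intersection-over-union definition (for the first sample it is 1/2, for the second 1/1). Note that for flat binary labels this score reduces to simple accuracy:

import numpy as np
from sklearn.metrics import jaccard_similarity_score

# Multilabel case: per-sample intersection over union, then averaged.
y_true = np.array([[0, 1], [1, 1]])
y_pred = np.ones((2, 2))

print(jaccard_similarity_score(y_true, y_pred))                   # 0.75
print(jaccard_similarity_score(y_true, y_pred, normalize=False))  # 1.5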

IsolationForest example

An example using IsolationForest for anomaly detection. The IsolationForest 'isolates' observations by randomly selecting a feature and then randomly selecting a split value between the maximum and minimum values of the selected feature. Since recursive partitioning can be represented by a tree structure, the number of splits required to isolate a sample is equivalent to the path length from the root node to the terminating node. This path length, averaged over a forest of such random trees, is a measure of normality and serves as the decision function.
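A minimal sketch, assuming synthetic 2-D data with a few injected outliers (the contamination value and cluster layout are illustrative):

import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.RandomState(42)

# Inliers: two tight Gaussian clusters. Outliers: uniform noise.
X_inliers = np.r_[rng.randn(100, 2) * 0.3 + 2, rng.randn(100, 2) * 0.3 - 2]
X_outliers = rng.uniform(low=-4, high=4, size=(20, 2))
X = np.r_[X_inliers, X_outliers]

clf = IsolationForest(n_estimators=100, contamination=0.1, random_state=rng)
clf.fit(X)

# predict returns +1 for inliers and -1 for outliers.
labels = clf.predict(X)
print("flagged as outliers:", (labels == -1).sum())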

Model selection

Score, and cross-validated scores

As we have seen, every estimator exposes a score method that can judge the quality of the fit (or the prediction) on new data. Bigger is better.

>>> from sklearn import datasets, svm
>>> digits = datasets.load_digits()
>>> X_digits = digits.data
>>> y_digits = digits.target
>>> svc = svm.SVC(C=1, kernel='linear')
>>> svc.fit(X_digits[:-100], y_digits[:-100]).score(X_digits[-100:], y_digits[-100:])
0.979999
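A single train/test split can be misleading; a minimal sketch of cross-validating the same estimator with cross_val_score (the fold count is illustrative):

from sklearn import datasets, svm
from sklearn.model_selection import cross_val_score

digits = datasets.load_digits()
svc = svm.SVC(C=1, kernel='linear')

# 3-fold cross-validation: fit on two folds, score on the held-out fold.
scores = cross_val_score(svc, digits.data, digits.target, cv=3)
print(scores, scores.mean())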

cluster.MeanShift()

class sklearn.cluster.MeanShift(bandwidth=None, seeds=None, bin_seeding=False, min_bin_freq=1, cluster_all=True, n_jobs=1) [source]

Mean shift clustering using a flat kernel. Mean shift clustering aims to discover 'blobs' in a smooth density of samples. It is a centroid-based algorithm, which works by updating candidates for centroids to be the mean of the points within a given region. These candidates are then filtered in a post-processing stage to eliminate near-duplicates to form the final set of centroids.
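A short usage sketch on blob data, estimating the bandwidth from the data (the quantile value and blob layout are illustrative):

from sklearn.cluster import MeanShift, estimate_bandwidth
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=3, cluster_std=0.6, random_state=0)

# estimate_bandwidth picks a kernel width from pairwise distances.
bandwidth = estimate_bandwidth(X, quantile=0.2)
ms = MeanShift(bandwidth=bandwidth, bin_seeding=True).fit(X)

print("clusters found:", len(ms.cluster_centers_))
print("labels for first 10 points:", ms.labels_[:10])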

Gaussian process regression with noise-level estimation

This example illustrates that GPR with a sum-kernel including a WhiteKernel can estimate the noise level of data. An illustration of the log-marginal-likelihood (LML) landscape shows that there exist two local maxima of LML. The first corresponds to a model with a high noise level and a large length scale, which explains all variations in the data by noise. The second one has a smaller noise level and a shorter length scale, which explains most of the variation by the noise-free functional relationship.
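A minimal sketch of the setup, assuming noisy 1-D data; the kernel hyperparameters and noise level below are illustrative starting points that the optimizer refines:

import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, WhiteKernel

rng = np.random.RandomState(0)
X = rng.uniform(0, 5, 20)[:, np.newaxis]
y = 0.5 * np.sin(3 * X[:, 0]) + rng.normal(0, 0.5, X.shape[0])

# Sum kernel: the RBF captures the smooth signal, the WhiteKernel
# absorbs the noise; its fitted noise_level estimates the noise variance.
kernel = 1.0 * RBF(length_scale=1.0) + WhiteKernel(noise_level=1.0)
gpr = GaussianProcessRegressor(kernel=kernel, alpha=0.0).fit(X, y)

print(gpr.kernel_)
print("log-marginal-likelihood:",
      gpr.log_marginal_likelihood(gpr.kernel_.theta))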

sklearn.ensemble.partial_dependence.plot_partial_dependence()

sklearn.ensemble.partial_dependence.plot_partial_dependence(gbrt, X, features, feature_names=None, label=None, n_cols=3, grid_resolution=100, percentiles=(0.05, 0.95), n_jobs=1, verbose=0, ax=None, line_kw=None, contour_kw=None, **fig_kw) [source]

Partial dependence plots for features. The len(features) plots are arranged in a grid with n_cols columns. Two-way partial dependence plots are plotted as contour plots. Read more in the User Guide.

Parameters:
gbrt : BaseGradientBoosting
    A fitted gradient boosting model.
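A usage sketch with a fitted GradientBoostingRegressor on toy data; the feature indices are illustrative, and a (0, 1) tuple requests a two-way contour plot. Note that the module path shown here is the one used in the sklearn version documented above; it changed in later releases:

from sklearn.datasets import make_friedman1
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.ensemble.partial_dependence import plot_partial_dependence

X, y = make_friedman1(n_samples=500, random_state=0)
gbrt = GradientBoostingRegressor(n_estimators=100, random_state=0).fit(X, y)

# One-way plots for features 0 and 1, plus a two-way contour for (0, 1).
fig, axs = plot_partial_dependence(gbrt, X, features=[0, 1, (0, 1)],
                                   grid_resolution=50)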

model_selection.LeaveOneGroupOut

class sklearn.model_selection.LeaveOneGroupOut [source]

Leave One Group Out cross-validator. Provides train/test indices to split data according to a third-party provided group. This group information can be used to encode arbitrary domain-specific stratifications of the samples as integers. For instance, the groups could be the year of collection of the samples, allowing for cross-validation against time-based splits. Read more in the User Guide.
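The docstring example is truncated above; a minimal sketch of how the splitter behaves, assuming a small toy array with two groups:

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut

X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
y = np.array([1, 2, 1, 2])
groups = np.array([1, 1, 2, 2])

logo = LeaveOneGroupOut()
print(logo.get_n_splits(X, y, groups))  # 2: one split per distinct group

# Each iteration holds out exactly one group as the test set.
for train_index, test_index in logo.split(X, y, groups):
    print("TRAIN:", train_index, "TEST:", test_index)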