model_selection.LeaveOneOut

class sklearn.model_selection.LeaveOneOut

Leave-One-Out cross-validator. Provides train/test indices to split data into train/test sets. Each sample is used once as the test set (a singleton) while the remaining samples form the training set. Note: LeaveOneOut() is equivalent to KFold(n_splits=n) and LeavePOut(p=1), where n is the number of samples. Because the number of test sets equals the number of samples, this cross-validation method can be very costly; for large datasets one should favor KFold, ShuffleSplit, or StratifiedKFold.
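A minimal usage sketch; the toy array of four samples is only for illustration:

    import numpy as np
    from sklearn.model_selection import LeaveOneOut

    X = np.array([[1, 2], [3, 4], [5, 6], [7, 8]])
    y = np.array([0, 1, 0, 1])

    loo = LeaveOneOut()
    # One split per sample: each iteration holds out a single test index.
    for train_index, test_index in loo.split(X):
        print("train:", train_index, "test:", test_index)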

neighbors.RadiusNeighborsRegressor()

class sklearn.neighbors.RadiusNeighborsRegressor(radius=1.0, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', metric_params=None, **kwargs)

Regression based on neighbors within a fixed radius. The target is predicted by local interpolation of the targets associated with the nearest neighbors in the training set. Read more in the User Guide.

Parameters:
radius : float, optional (default = 1.0)
    Range of parameter space to use by default for radius_neighbors queries.
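A short fit/predict sketch; the toy data and the radius value are illustrative:

    import numpy as np
    from sklearn.neighbors import RadiusNeighborsRegressor

    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y = np.array([0.0, 0.5, 1.0, 1.5])

    # All training points within radius 1.5 of a query contribute to the prediction.
    reg = RadiusNeighborsRegressor(radius=1.5, weights='uniform')
    reg.fit(X, y)
    print(reg.predict([[1.6]]))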

neighbors.LSHForest()

class sklearn.neighbors.LSHForest(n_estimators=10, radius=1.0, n_candidates=50, n_neighbors=5, min_hash_match=4, radius_cutoff_ratio=0.9, random_state=None)

Performs approximate nearest neighbor search using LSH forest. LSH Forest: Locality Sensitive Hashing forest [1] is an alternative to vanilla approximate nearest neighbor search methods. The LSH forest data structure is implemented using sorted arrays, binary search, and 32-bit fixed-length hashes. Random projection is used as the hash family, which approximates cosine distance.
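A minimal sketch, assuming a scikit-learn version that still ships LSHForest (it was deprecated in 0.19 and removed in later releases); the random data is only for illustration:

    import numpy as np
    from sklearn.neighbors import LSHForest

    rng = np.random.RandomState(42)
    X_train = rng.rand(100, 10)
    X_query = rng.rand(3, 10)

    # Build the forest of LSH trees on the training set.
    lshf = LSHForest(n_estimators=10, random_state=42)
    lshf.fit(X_train)

    # Approximate k-nearest-neighbor query.
    distances, indices = lshf.kneighbors(X_query, n_neighbors=5)
    print(indices)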

model_selection.GroupKFold()

class sklearn.model_selection.GroupKFold(n_splits=3)

K-fold iterator variant with non-overlapping groups. The same group will not appear in two different folds (the number of distinct groups has to be at least equal to the number of folds). The folds are approximately balanced in the sense that the number of distinct groups is approximately the same in each fold.

Parameters:
n_splits : int, default=3
    Number of folds. Must be at least 2.

See also: LeaveOneGroupOut, for splitting the data according to explicit domain-specific stratification of the dataset.
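A small sketch showing that samples sharing a group label never end up in both train and test; the group assignments below are made up:

    import numpy as np
    from sklearn.model_selection import GroupKFold

    X = np.array([[1], [2], [3], [4], [5], [6]])
    y = np.array([0, 1, 0, 1, 0, 1])
    groups = np.array([0, 0, 1, 1, 2, 2])  # three distinct groups

    gkf = GroupKFold(n_splits=3)
    for train_index, test_index in gkf.split(X, y, groups):
        print("train:", train_index, "test:", test_index)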

exceptions.DataConversionWarning

class sklearn.exceptions.DataConversionWarning

Warning used to notify of implicit data conversions happening in the code. This warning occurs when some input data needs to be converted or interpreted in a way that may not match the user's expectations. For example, this warning may occur when the user:
- passes an integer array to a function which expects float input and will convert the input;
- requests a non-copying operation, but a copy is required to meet the implementation's data-type expectations.
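A hedged sketch of one way to surface this warning during debugging; the column-vector-y trigger and the estimator used here are just one illustrative case, not part of the warning class itself:

    import warnings
    import numpy as np
    from sklearn.exceptions import DataConversionWarning
    from sklearn.ensemble import RandomForestClassifier

    X = np.array([[0.0], [1.0], [2.0], [3.0]])
    y_column = np.array([[0], [0], [1], [1]])  # column vector where a 1-d array is expected

    # Record warnings emitted while fitting, so the implicit conversion is visible.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always", DataConversionWarning)
        RandomForestClassifier(n_estimators=5).fit(X, y_column)

    for w in caught:
        print(w.category.__name__, ":", w.message)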

Hashing feature transformation using Totally Random Trees

RandomTreesEmbedding provides a way to map data to a very high-dimensional, sparse representation, which might be beneficial for classification. The mapping is completely unsupervised and very efficient. This example visualizes the partitions given by several trees and shows how the transformation can also be used for non-linear dimensionality reduction or non-linear classification. Neighboring points often share the same leaf of a tree and therefore share large parts of their hashed representation.
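A minimal sketch of the hashing transformation itself; the dataset and estimator settings are illustrative, and the full example additionally plots the tree partitions:

    from sklearn.datasets import make_circles
    from sklearn.ensemble import RandomTreesEmbedding
    from sklearn.decomposition import TruncatedSVD

    X, y = make_circles(factor=0.5, random_state=0, noise=0.05)

    # Map each sample to the sparse one-hot encoding of the leaves it falls into.
    hasher = RandomTreesEmbedding(n_estimators=10, random_state=0, max_depth=3)
    X_transformed = hasher.fit_transform(X)
    print(X_transformed.shape)  # (n_samples, total number of leaves), sparse

    # Project the sparse encoding to 2D for non-linear dimensionality reduction.
    svd = TruncatedSVD(n_components=2)
    X_reduced = svd.fit_transform(X_transformed)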

The Digit Dataset

This dataset is made up of 1797 8x8 images. Each image, like the one shown below, is of a hand-written digit. In order to utilize an 8x8 figure like this, we'd have to first transform it into a feature vector with length 64. See here for more information about this dataset.

    print(__doc__)

    # Code source: Gaël Varoquaux
    # Modified for documentation by Jaques Grobler
    # License: BSD 3 clause

    from sklearn import datasets
    import matplotlib.pyplot as plt

    # Load the digits dataset
    digits = datasets.load_digits()

    # Display one digit as an 8x8 grayscale image
    plt.figure(1, figsize=(3, 3))
    plt.imshow(digits.images[-1], cmap=plt.cm.gray_r, interpolation='nearest')
    plt.show()

sklearn.preprocessing.minmax_scale()

sklearn.preprocessing.minmax_scale(X, feature_range=(0, 1), axis=0, copy=True)

Transform features by scaling each feature to a given range. This estimator scales and translates each feature individually such that it is in the given range on the training set, i.e. between zero and one. The transformation is given by:

    X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
    X_scaled = X_std * (max - min) + min

where min, max = feature_range. This transformation is often used as an alternative to zero mean, unit variance scaling.
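A quick sketch on a toy array; the values are chosen only to show the per-column scaling:

    import numpy as np
    from sklearn.preprocessing import minmax_scale

    X = np.array([[1.0, 20.0],
                  [2.0, 40.0],
                  [3.0, 60.0]])

    # Each column is independently rescaled to the requested range.
    print(minmax_scale(X, feature_range=(0, 1)))
    # column 0 -> [0.0, 0.5, 1.0]; column 1 -> [0.0, 0.5, 1.0]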

ensemble.BaggingClassifier()

class sklearn.ensemble.BaggingClassifier(base_estimator=None, n_estimators=10, max_samples=1.0, max_features=1.0, bootstrap=True, bootstrap_features=False, oob_score=False, warm_start=False, n_jobs=1, random_state=None, verbose=0)

A Bagging classifier. A Bagging classifier is an ensemble meta-estimator that fits base classifiers each on random subsets of the original dataset and then aggregates their individual predictions (either by voting or by averaging) to form a final prediction.
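A minimal fit/predict sketch using the default decision-tree base estimator; the synthetic data is only for illustration:

    from sklearn.datasets import make_classification
    from sklearn.ensemble import BaggingClassifier

    X, y = make_classification(n_samples=200, n_features=10, random_state=0)

    # Each of the 10 ensemble members is trained on a bootstrap sample of the data.
    clf = BaggingClassifier(n_estimators=10, random_state=0)
    clf.fit(X, y)
    print(clf.predict(X[:5]))
    print(clf.score(X, y))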

SGD: Maximum margin separating hyperplane

Plot the maximum margin separating hyperplane within a two-class separable dataset, using a linear Support Vector Machine classifier trained with SGD.

    print(__doc__)

    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.linear_model import SGDClassifier
    from sklearn.datasets import make_blobs

    # we create 50 separable points
    X, Y = make_blobs(n_samples=50, centers=2, random_state=0, cluster_std=0.60)

    # fit the model
    clf = SGDClassifier(loss="hinge", alpha=0.01)
    clf.fit(X, Y)