sklearn.model_selection.learning_curve()

sklearn.model_selection.learning_curve(estimator, X, y, groups=None, train_sizes=array([ 0.1, 0.33, 0.55, 0.78, 1. ]), cv=None, scoring=None, exploit_incremental_learning=False, n_jobs=1, pre_dispatch='all', verbose=0)

Learning curve. Determines cross-validated training and test scores for different training set sizes. A cross-validation generator splits the whole dataset k times into training and test data. Subsets of the training set with varying sizes will be used to train the estimator, and a score for each training subset size and for the test set will be computed. Afterwards, the scores are averaged over all k runs for each training subset size.
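For illustration, a minimal sketch of calling learning_curve; the dataset and estimator here are our own choices, not part of the original page:

    import numpy as np
    from sklearn.datasets import load_iris
    from sklearn.model_selection import learning_curve
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # Evaluate an SVC at 5 training-set sizes with 5-fold cross-validation
    train_sizes, train_scores, test_scores = learning_curve(
        SVC(kernel='linear'), X, y, cv=5,
        train_sizes=np.linspace(0.1, 1.0, 5))

    # Average the scores over the k cross-validation runs per size
    print(train_sizes)
    print(train_scores.mean(axis=1))
    print(test_scores.mean(axis=1))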

linear_model.MultiTaskElasticNet()

class sklearn.linear_model.MultiTaskElasticNet(alpha=1.0, l1_ratio=0.5, fit_intercept=True, normalize=False, copy_X=True, max_iter=1000, tol=0.0001, warm_start=False, random_state=None, selection='cyclic')

Multi-task ElasticNet model trained with an L1/L2 mixed-norm as regularizer. The optimization objective for MultiTaskElasticNet is:

(1 / (2 * n_samples)) * ||Y - XW||_Fro^2 + alpha * l1_ratio * ||W||_21 + 0.5 * alpha * (1 - l1_ratio) * ||W||_Fro^2

where

||W||_21 = \sum_i \sqrt{\sum_j w_{ij}^2}

i.e. the sum of the norms of each row of W.
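A brief sketch of fitting MultiTaskElasticNet on synthetic data; the data construction is our own, for illustration only:

    import numpy as np
    from sklearn.linear_model import MultiTaskElasticNet

    # Two regression targets that share the same sparse coefficient pattern
    rng = np.random.RandomState(0)
    X = rng.randn(50, 10)
    W = np.zeros((10, 2))
    W[:3] = 1.0                        # only the first 3 features are informative
    Y = X.dot(W)

    clf = MultiTaskElasticNet(alpha=0.1, l1_ratio=0.5)
    clf.fit(X, Y)
    print(clf.coef_.shape)             # (n_tasks, n_features) == (2, 10)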

ensemble.IsolationForest()

class sklearn.ensemble.IsolationForest(n_estimators=100, max_samples='auto', contamination=0.1, max_features=1.0, bootstrap=False, n_jobs=1, random_state=None, verbose=0)

Isolation Forest algorithm. Returns the anomaly score of each sample using the IsolationForest algorithm. The IsolationForest 'isolates' observations by randomly selecting a feature and then randomly selecting a split value between the maximum and minimum values of the selected feature. Since recursive partitioning can be represented by a tree structure, the number of splittings required to isolate a sample is equivalent to the path length from the root node to the terminating node.
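A minimal sketch of training an IsolationForest and scoring points far from the training distribution; the toy data below is ours, not from the original page:

    import numpy as np
    from sklearn.ensemble import IsolationForest

    rng = np.random.RandomState(42)
    X_train = 0.3 * rng.randn(100, 2)                  # inliers near the origin
    X_outliers = rng.uniform(low=-4, high=4, size=(10, 2))

    clf = IsolationForest(n_estimators=100, contamination=0.1, random_state=rng)
    clf.fit(X_train)
    print(clf.predict(X_outliers))                     # -1 flags anomalies, +1 inliers
    print(clf.decision_function(X_outliers))           # lower score = more anomalous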

pipeline.FeatureUnion()

class sklearn.pipeline.FeatureUnion(transformer_list, n_jobs=1, transformer_weights=None)

Concatenates results of multiple transformer objects. This estimator applies a list of transformer objects in parallel to the input data, then concatenates the results. This is useful to combine several feature extraction mechanisms into a single transformer. Parameters of the transformers may be set using the transformer's name and the parameter name separated by '__'. A transformer may be replaced entirely by setting the parameter with its name to another transformer, or removed by setting it to None.
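As an illustrative sketch (the choice of transformers is our own), a FeatureUnion that stacks PCA components next to a univariately selected feature:

    from sklearn.datasets import load_iris
    from sklearn.decomposition import PCA
    from sklearn.feature_selection import SelectKBest
    from sklearn.pipeline import FeatureUnion

    X, y = load_iris(return_X_y=True)

    # Run both transformers on X and concatenate their outputs column-wise
    union = FeatureUnion([('pca', PCA(n_components=2)),
                          ('kbest', SelectKBest(k=1))])
    X_combined = union.fit_transform(X, y)
    print(X_combined.shape)            # (150, 3): 2 PCA columns + 1 selected column

    # Nested parameters follow the '<name>__<param>' convention
    union.set_params(pca__n_components=1)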

preprocessing.MinMaxScaler()

class sklearn.preprocessing.MinMaxScaler(feature_range=(0, 1), copy=True)

Transforms features by scaling each feature to a given range. This estimator scales and translates each feature individually such that it lies in the given range on the training set, e.g. between zero and one. The transformation is given by:

X_std = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
X_scaled = X_std * (max - min) + min

where min, max = feature_range. This transformation is often used as an alternative to zero-mean, unit-variance scaling.
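A short sketch of the formula above in action; the sample matrix is our own:

    import numpy as np
    from sklearn.preprocessing import MinMaxScaler

    X_train = np.array([[1., -1., 2.],
                        [2., 0., 0.],
                        [0., 1., -1.]])

    scaler = MinMaxScaler(feature_range=(0, 1))
    print(scaler.fit_transform(X_train))       # each column now spans [0, 1]

    # The min/max learned on the training set is reused for new data
    print(scaler.transform(np.array([[1.5, 0., 1.]])))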

dummy.DummyClassifier()

class sklearn.dummy.DummyClassifier(strategy='stratified', random_state=None, constant=None)

DummyClassifier is a classifier that makes predictions using simple rules. This classifier is useful as a simple baseline to compare with other (real) classifiers. Do not use it for real problems. Read more in the User Guide.

Parameters:
strategy : str, default='stratified'
Strategy to use to generate predictions. 'stratified': generates predictions by respecting the training set's class distribution.
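For example, a sketch of using DummyClassifier as an accuracy floor; the 'most_frequent' strategy and dataset are our own choices for illustration:

    from sklearn.datasets import load_iris
    from sklearn.dummy import DummyClassifier
    from sklearn.model_selection import train_test_split

    X, y = load_iris(return_X_y=True)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    # 'most_frequent' always predicts the majority class of the training set
    baseline = DummyClassifier(strategy='most_frequent')
    baseline.fit(X_train, y_train)
    print(baseline.score(X_test, y_test))      # the accuracy a real model must beat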

An introduction to machine learning with scikit-learn

Section contents: in this section, we introduce the machine learning vocabulary that we use throughout scikit-learn and give a simple learning example.

Machine learning: the problem setting

In general, a learning problem considers a set of n samples of data and then tries to predict properties of unknown data. If each sample is more than a single number and, for instance, a multi-dimensional entry (aka multivariate data), it is said to have several attributes or features. We can separate learning problems into a few large categories: supervised learning and unsupervised learning.
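A simple learning example in this spirit, condensed from the tutorial's digits example:

    from sklearn import datasets, svm

    # Each sample is a flattened 8x8 image of a digit: 64 features per sample
    digits = datasets.load_digits()
    clf = svm.SVC(gamma=0.001, C=100.)

    # Learn from all but the last sample, then predict the held-out image
    clf.fit(digits.data[:-1], digits.target[:-1])
    print(clf.predict(digits.data[-1:]))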

1.9. Naive Bayes

Naive Bayes methods are a set of supervised learning algorithms based on applying Bayes' theorem with the 'naive' assumption of independence between every pair of features. Given a class variable y and a dependent feature vector x_1 through x_n, Bayes' theorem states the following relationship:

P(y | x_1, ..., x_n) = P(y) * P(x_1, ..., x_n | y) / P(x_1, ..., x_n)

Using the naive independence assumption that

P(x_i | y, x_1, ..., x_{i-1}, x_{i+1}, ..., x_n) = P(x_i | y)

for all i, this relationship is simplified to

P(y | x_1, ..., x_n) = P(y) * prod_{i=1..n} P(x_i | y) / P(x_1, ..., x_n)

Since P(x_1, ..., x_n) is constant given the input, we can use the following classification rule:

y_hat = argmax_y P(y) * prod_{i=1..n} P(x_i | y)

and we can use Maximum A Posteriori (MAP) estimation to estimate P(y) and P(x_i | y).
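A minimal sketch with the Gaussian variant, which models P(x_i | y) as a per-class, per-feature Gaussian; the dataset choice is ours:

    from sklearn.datasets import load_iris
    from sklearn.naive_bayes import GaussianNB

    X, y = load_iris(return_X_y=True)

    # Fit on the full data, then check how many training points are misclassified
    gnb = GaussianNB()
    y_pred = gnb.fit(X, y).predict(X)
    print("Mislabeled points: %d out of %d" % ((y != y_pred).sum(), X.shape[0]))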

preprocessing.Normalizer()

class sklearn.preprocessing.Normalizer(norm='l2', copy=True)

Normalize samples individually to unit norm. Each sample (i.e. each row of the data matrix) with at least one non-zero component is rescaled independently of other samples so that its norm (l1 or l2) equals one. This transformer is able to work with both dense numpy arrays and scipy.sparse matrices (use CSR format if you want to avoid the burden of a copy/conversion). Scaling inputs to unit norms is a common operation for text classification or clustering, for instance.
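A quick sketch showing the per-row rescaling; the sample matrix is our own:

    import numpy as np
    from sklearn.preprocessing import Normalizer

    X = np.array([[4., 1., 2., 2.],
                  [1., 3., 9., 3.],
                  [5., 7., 5., 1.]])

    # Normalizer is stateless: fit learns nothing, transform rescales each row
    X_normalized = Normalizer(norm='l2').fit_transform(X)
    print(np.linalg.norm(X_normalized, axis=1))   # every row now has unit norm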

neighbors.RadiusNeighborsClassifier()

class sklearn.neighbors.RadiusNeighborsClassifier(radius=1.0, weights='uniform', algorithm='auto', leaf_size=30, p=2, metric='minkowski', outlier_label=None, metric_params=None, **kwargs)

Classifier implementing a vote among neighbors within a given radius. Read more in the User Guide.

Parameters:
radius : float, optional (default = 1.0)
Range of parameter space to use by default for radius_neighbors queries.
weights : str or callable
Weight function used in prediction. Possible values: 'uniform' (all points in each neighborhood are weighted equally), 'distance' (weight points by the inverse of their distance), or a user-defined callable.
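A minimal sketch of the radius-based vote; the one-dimensional toy data is ours:

    import numpy as np
    from sklearn.neighbors import RadiusNeighborsClassifier

    X = np.array([[0.], [1.], [2.], [3.]])
    y = np.array([0, 0, 1, 1])

    # All training points within radius 1.0 of the query get a vote
    clf = RadiusNeighborsClassifier(radius=1.0)
    clf.fit(X, y)
    print(clf.predict([[0.8]]))   # neighbors [0.] and [1.] -> class 0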