Comparison of kernel ridge and Gaussian process regression

Both kernel ridge regression (KRR) and Gaussian process regression (GPR) learn a target function by employing the "kernel trick" internally. KRR learns a linear function in the space induced by the respective kernel, which corresponds to a non-linear function in the original space. The linear function in the kernel space is chosen based on the mean-squared error loss with ridge regularization. GPR uses the kernel to define the covariance of a prior distribution over the target functions and uses the observed training data to define a likelihood function; the resulting (Gaussian) posterior distribution over target functions is used for prediction, with its mean serving as the point estimate.
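As a rough illustration of this difference (not part of the original text), the sketch below fits both estimators on the same noisy synthetic data; the kernel and regularization settings are illustrative choices, not recommendations:

    import numpy as np
    from sklearn.kernel_ridge import KernelRidge
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.RandomState(0)
    X = 15 * rng.rand(100, 1)
    y = np.sin(X).ravel() + 0.5 * rng.randn(100)      # noisy sine target

    # KRR: squared-error loss with a ridge penalty in the kernel-induced space
    krr = KernelRidge(kernel="rbf", alpha=1.0, gamma=0.1).fit(X, y)

    # GPR: the kernel defines the covariance of the prior over functions;
    # the posterior mean is the prediction, and a standard deviation comes for free
    gpr = GaussianProcessRegressor(kernel=RBF(length_scale=1.0) + WhiteKernel()).fit(X, y)

    X_plot = np.linspace(0, 15, 200)[:, np.newaxis]
    y_krr = krr.predict(X_plot)
    y_gpr, y_std = gpr.predict(X_plot, return_std=True)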

kernel_approximation.RBFSampler()

class sklearn.kernel_approximation.RBFSampler(gamma=1.0, n_components=100, random_state=None)

Approximates the feature map of an RBF kernel by Monte Carlo approximation of its Fourier transform. It implements a variant of Random Kitchen Sinks [1]. Read more in the User Guide.

Parameters:
gamma : float
    Parameter of the RBF kernel: exp(-gamma * x^2).
n_components : int
    Number of Monte Carlo samples per original feature. Equals the dimensionality of the computed feature space.
random_state : int or RandomState, optional
    Seed or generator used to draw the random weights and random offset.
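A short usage sketch, pairing the transformer with a linear model as is typical; the tiny dataset and the SGDClassifier settings are purely illustrative:

    from sklearn.kernel_approximation import RBFSampler
    from sklearn.linear_model import SGDClassifier

    X = [[0, 0], [1, 1], [1, 0], [0, 1]]
    y = [0, 0, 1, 1]

    rbf_feature = RBFSampler(gamma=1.0, random_state=1)
    X_features = rbf_feature.fit_transform(X)   # shape (4, 100): random Fourier features
    clf = SGDClassifier(max_iter=5)             # linear model trained on the new features
    clf.fit(X_features, y)
    print(clf.score(X_features, y))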

tree.DecisionTreeRegressor()

class sklearn.tree.DecisionTreeRegressor(criterion='mse', splitter='best', max_depth=None, min_samples_split=2, min_samples_leaf=1, min_weight_fraction_leaf=0.0, max_features=None, random_state=None, max_leaf_nodes=None, min_impurity_split=1e-07, presort=False)

A decision tree regressor. Read more in the User Guide.

Parameters:
criterion : string, optional (default='mse')
    The function to measure the quality of a split. Supported criteria are 'mse' for the mean squared error, which is equal to variance reduction as feature selection criterion.
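A minimal fit-and-predict sketch on synthetic data; the max_depth value is an arbitrary choice to keep the tree small:

    import numpy as np
    from sklearn.tree import DecisionTreeRegressor

    rng = np.random.RandomState(0)
    X = np.sort(5 * rng.rand(80, 1), axis=0)
    y = np.sin(X).ravel()
    y[::5] += 3 * (0.5 - rng.rand(16))        # add noise to every 5th target

    reg = DecisionTreeRegressor(max_depth=3)  # shallow tree to limit overfitting
    reg.fit(X, y)
    print(reg.predict([[2.5]]))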

sklearn.cluster.ward_tree()

sklearn.cluster.ward_tree(X, connectivity=None, n_clusters=None, return_distance=False)

Ward clustering based on a feature matrix. Recursively merges the pair of clusters that minimally increases within-cluster variance. The inertia matrix uses a heapq-based representation. This is the structured version, which takes into account some topological structure between samples. Read more in the User Guide.

Parameters:
X : array, shape (n_samples, n_features)
    Feature matrix representing n_samples samples to be clustered.
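A small sketch of the structured call, assuming the import path of the scikit-learn version quoted here; a k-nearest-neighbors graph is just one way to supply the connectivity structure:

    import numpy as np
    from sklearn.cluster import ward_tree
    from sklearn.neighbors import kneighbors_graph

    X = np.random.RandomState(0).rand(20, 3)
    # connectivity graph restricting which samples may be merged
    connectivity = kneighbors_graph(X, n_neighbors=5, include_self=False)

    children, n_components, n_leaves, parents = ward_tree(X, connectivity=connectivity)
    print(children[:5])   # each row gives the two nodes merged at that step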

sklearn.feature_selection.f_regression()

sklearn.feature_selection.f_regression(X, y, center=True)

Univariate linear regression tests. Quick linear model for testing the effect of a single regressor, sequentially for many regressors. This is done in 2 steps:

1. The cross correlation between each regressor and the target is computed, that is, ((X[:, i] - mean(X[:, i])) * (y - mean_y)) / (std(X[:, i]) * std(y)).
2. It is converted to an F score and then to a p-value.

Read more in the User Guide.

Parameters:
X : {array-like, sparse matrix}, shape (n_samples, n_features)
    The set of regressors that will be tested sequentially.
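A small sketch on synthetic data in which only the first column is related to the target; the shapes and noise level are arbitrary:

    import numpy as np
    from sklearn.feature_selection import f_regression

    rng = np.random.RandomState(0)
    X = rng.rand(100, 3)
    y = 2 * X[:, 0] + 0.1 * rng.randn(100)   # only feature 0 is informative

    F, pval = f_regression(X, y)
    print(F)      # feature 0 should receive by far the largest F score
    print(pval)   # and the smallest p-value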

Compare cross decomposition methods

Simple usage of various cross decomposition algorithms:

- PLSCanonical
- PLSRegression, with multivariate response, a.k.a. PLS2
- PLSRegression, with univariate response, a.k.a. PLS1
- CCA

Given two multivariate, covarying two-dimensional datasets, X and Y, PLS extracts the "directions of covariance", i.e. the components of each dataset that explain the most shared variance between both datasets. This is apparent on the scatterplot matrix display: components 1 in dataset X and dataset Y are maximally correlated.
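A condensed sketch loosely following the idea of that example: two views X and Y share latent variables, and PLSCanonical recovers correlated components (dimensions and noise levels are arbitrary):

    import numpy as np
    from sklearn.cross_decomposition import PLSCanonical

    rng = np.random.RandomState(0)
    n = 500
    latents = rng.normal(size=(n, 2))          # shared latent structure
    X = latents + rng.normal(size=(n, 2))      # view 1 = latents + noise
    Y = latents + rng.normal(size=(n, 2))      # view 2 = latents + noise

    plsca = PLSCanonical(n_components=2)
    plsca.fit(X, Y)
    X_scores, Y_scores = plsca.transform(X, Y)
    # component 1 of X_scores and Y_scores should be strongly correlated
    print(np.corrcoef(X_scores[:, 0], Y_scores[:, 0])[0, 1])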

sklearn.datasets.fetch_20newsgroups()

sklearn.datasets.fetch_20newsgroups(data_home=None, subset='train', categories=None, shuffle=True, random_state=42, remove=(), download_if_missing=True)

Load the filenames and data from the 20 newsgroups dataset. Read more in the User Guide.

Parameters:
subset : 'train' or 'test', 'all', optional
    Select the dataset to load: 'train' for the training set, 'test' for the test set, 'all' for both, with shuffled ordering.
data_home : optional, default: None
    Specify a download and cache folder for the datasets. If None, all scikit-learn data is stored in the default '~/scikit_learn_data' subfolders.
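A minimal fetching sketch; the two category names are examples from the dataset, and the first call downloads the data to the cache folder:

    from sklearn.datasets import fetch_20newsgroups

    categories = ['sci.space', 'rec.autos']          # any subset of the 20 group names
    train = fetch_20newsgroups(subset='train', categories=categories,
                               remove=('headers', 'footers', 'quotes'))

    print(train.target_names)
    print(len(train.data), 'documents')
    print(train.data[0][:200])                       # raw text of the first post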

3.2. Tuning the hyper-parameters of an estimator

Hyper-parameters are parameters that are not directly learnt within estimators. In scikit-learn they are passed as arguments to the constructor of the estimator classes. Typical examples include C, kernel and gamma for Support Vector Classifier, alpha for Lasso, etc. It is possible and recommended to search the hyper-parameter space for the best cross-validation score (see Cross-validation: evaluating estimator performance). Any parameter provided when constructing an estimator may be optimized in this manner. Specifically, to find the names and current values of all parameters for a given estimator, use estimator.get_params().
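As a concrete sketch of such a search (the grid values and the SVC/iris pairing are arbitrary choices, not part of the original text):

    from sklearn.datasets import load_iris
    from sklearn.model_selection import GridSearchCV
    from sklearn.svm import SVC

    X, y = load_iris(return_X_y=True)

    # any constructor argument of SVC could appear in this grid
    param_grid = {'C': [0.1, 1, 10], 'gamma': [0.01, 0.1, 1], 'kernel': ['rbf']}
    search = GridSearchCV(SVC(), param_grid, cv=5)
    search.fit(X, y)

    print(search.best_params_)   # hyper-parameters with the best cross-validation score
    print(search.best_score_)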

neural_network.BernoulliRBM()

class sklearn.neural_network.BernoulliRBM(n_components=256, learning_rate=0.1, batch_size=10, n_iter=10, verbose=0, random_state=None)

Bernoulli Restricted Boltzmann Machine (RBM). A Restricted Boltzmann Machine with binary visible units and binary hidden units. Parameters are estimated using Stochastic Maximum Likelihood (SML), also known as Persistent Contrastive Divergence (PCD) [2]. The time complexity of this implementation is O(d ** 2), assuming d ~ n_features ~ n_components.
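A toy sketch of fitting the RBM and reading out hidden-unit activations; the binary data and the small n_components are illustrative only:

    import numpy as np
    from sklearn.neural_network import BernoulliRBM

    X = np.array([[0, 0, 0],
                  [0, 1, 1],
                  [1, 0, 1],
                  [1, 1, 1]])

    model = BernoulliRBM(n_components=2, learning_rate=0.05, n_iter=20, random_state=0)
    hidden = model.fit_transform(X)   # P(h_j = 1 | v) for each sample
    print(hidden.shape)               # (4, 2)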

preprocessing.PolynomialFeatures()

class sklearn.preprocessing.PolynomialFeatures(degree=2, interaction_only=False, include_bias=True)

Generate polynomial and interaction features. Generate a new feature matrix consisting of all polynomial combinations of the features with degree less than or equal to the specified degree. For example, if an input sample is two dimensional and of the form [a, b], the degree-2 polynomial features are [1, a, b, a^2, ab, b^2].

Parameters:
degree : integer
    The degree of the polynomial features. Default = 2.
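A minimal transform sketch matching the [a, b] example above (the 3x2 input matrix is arbitrary):

    import numpy as np
    from sklearn.preprocessing import PolynomialFeatures

    X = np.arange(6).reshape(3, 2)      # rows of the form [a, b]
    poly = PolynomialFeatures(degree=2)
    print(poly.fit_transform(X))        # columns: 1, a, b, a^2, a*b, b^2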