sklearn.decomposition.dict_learning()

sklearn.decomposition.dict_learning(X, n_components, alpha, max_iter=100, tol=1e-08, method='lars', n_jobs=1, dict_init=None, code_init=None, callback=None, verbose=False, random_state=None, return_n_iter=False) Solves a dictionary learning matrix factorization problem. Finds the best dictionary and the corresponding sparse code for approximating the data matrix X by solving: (U^*, V^*) = argmin_{(U,V)} 0.5 || X - U V ||_2^2 + alpha * || U ||_1 with || V_k ||
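A minimal usage sketch, assuming a scikit-learn installation where `dict_learning` returns the tuple `(code, dictionary, errors)`. The data matrix is synthetic and its sizes are arbitrary choices for illustration. In the notation of the objective above, `code` plays the role of U and `dictionary` the role of V.

```python
import numpy as np
from sklearn.decomposition import dict_learning

# Synthetic data: 20 samples with 8 features each (illustrative sizes).
rng = np.random.RandomState(0)
X = rng.randn(20, 8)

# Keyword arguments are used so the call does not depend on argument order.
code, dictionary, errors = dict_learning(
    X, n_components=5, alpha=1.0, max_iter=50, random_state=0
)

print(code.shape)        # (20, 5): one sparse code row per sample
print(dictionary.shape)  # (5, 8): one atom per component

# Relative reconstruction error of X by code @ dictionary.
rel_err = np.linalg.norm(X - code @ dictionary) / np.linalg.norm(X)
print(rel_err)
```

Larger values of `alpha` should push more entries of `code` to zero at the cost of a larger reconstruction error.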

sklearn.datasets.mldata_filename()

sklearn.datasets.mldata_filename(dataname) Convert a raw data set name into an mldata.org filename.

sklearn.datasets.make_s_curve()

sklearn.datasets.make_s_curve(n_samples=100, noise=0.0, random_state=None) Generate an S curve dataset. Read more in the User Guide. Parameters: n_samples : int, optional (default=100) The number of sample points on the S curve. noise : float, optional (default=0.0) The standard deviation of the Gaussian noise. random_state : int, RandomState instance or None, optional (default=None) If int, random_state is the seed used by the random number generator; If RandomState instance
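A short sketch of a typical call, assuming the function returns a pair: the 3-D sample points and a 1-D array giving each point's position along the curve. The sample count and noise level are arbitrary illustration values.

```python
from sklearn.datasets import make_s_curve

# 200 points on the S curve, with a little Gaussian noise added.
X, t = make_s_curve(n_samples=200, noise=0.05, random_state=0)

print(X.shape)  # (200, 3)
print(t.shape)  # (200,)
```

The second array is convenient for coloring a scatter plot so that the manifold structure stays visible.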

sklearn.datasets.make_swiss_roll()

sklearn.datasets.make_swiss_roll(n_samples=100, noise=0.0, random_state=None) Generate a swiss roll dataset. Read more in the User Guide. Parameters: n_samples : int, optional (default=100) The number of sample points on the swiss roll. noise : float, optional (default=0.0) The standard deviation of the Gaussian noise. random_state : int, RandomState instance or None, optional (default=None) If int, random_state is the seed used by the random number generator; If RandomState ins
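The sketch below illustrates the role of an integer `random_state`: two calls with the same seed are expected to produce identical data. It assumes the function returns the sample points together with their positions along the roll.

```python
import numpy as np
from sklearn.datasets import make_swiss_roll

# Two calls with the same integer seed.
X1, t1 = make_swiss_roll(n_samples=150, noise=0.1, random_state=42)
X2, t2 = make_swiss_roll(n_samples=150, noise=0.1, random_state=42)

print(X1.shape)  # (150, 3)
same = np.array_equal(X1, X2) and np.array_equal(t1, t2)
print(same)  # True
```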

sklearn.datasets.make_sparse_uncorrelated()

sklearn.datasets.make_sparse_uncorrelated(n_samples=100, n_features=10, random_state=None) Generate a random regression problem with sparse uncorrelated design. This dataset is described in Celeux et al. [1] as: X ~ N(0, 1), y(X) = X[:, 0] + 2 * X[:, 1] - 2 * X[:, 2] - 1.5 * X[:, 3]. Only the first 4 features are informative. The remaining features are useless. Read more in the User Guide. Parameters: n_samples : int, optional (default=100) The number of samples. n_features : int,
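One way to check the stated model is to fit ordinary least squares and compare the estimate with the coefficients 1, 2, -2, -1.5 given above. This is a sketch under the assumption that the generated targets may also contain Gaussian noise around y(X), so the comparison uses a tolerance rather than exact equality.

```python
import numpy as np
from sklearn.datasets import make_sparse_uncorrelated

X, y = make_sparse_uncorrelated(n_samples=5000, n_features=10, random_state=0)

# Ordinary least squares without an intercept.
coef_hat, *_ = np.linalg.lstsq(X, y, rcond=None)

# Coefficients from the formula above, zero for the uninformative features.
expected = np.array([1.0, 2.0, -2.0, -1.5] + [0.0] * 6)
print(np.round(coef_hat, 2))
print(np.allclose(coef_hat, expected, atol=0.1))
```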

sklearn.datasets.make_spd_matrix()

sklearn.datasets.make_spd_matrix(n_dim, random_state=None) Generate a random symmetric, positive-definite matrix. Read more in the User Guide. Parameters: n_dim : int The matrix dimension. random_state : int, RandomState instance or None, optional (default=None) If int, random_state is the seed used by the random number generator; If RandomState instance, random_state is the random number generator; If None, the random number generator is the RandomState instance used by np.ran
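A quick sanity check of the two advertised properties, sketched with NumPy: symmetry is tested with `np.allclose`, and positive-definiteness by inspecting the eigenvalues. The dimension 4 is an arbitrary choice.

```python
import numpy as np
from sklearn.datasets import make_spd_matrix

A = make_spd_matrix(4, random_state=0)

is_symmetric = np.allclose(A, A.T)
eigenvalues = np.linalg.eigvalsh(A)  # eigenvalues of a symmetric matrix

print(A.shape)                # (4, 4)
print(is_symmetric)           # True
print(bool(np.all(eigenvalues > 0)))  # True
```

Such a matrix can serve as a covariance matrix when sampling from a multivariate normal distribution.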

sklearn.datasets.make_sparse_spd_matrix()

sklearn.datasets.make_sparse_spd_matrix(dim=1, alpha=0.95, norm_diag=False, smallest_coef=0.1, largest_coef=0.9, random_state=None) Generate a sparse symmetric positive-definite matrix. Read more in the User Guide. Parameters: dim : integer, optional (default=1) The size of the random matrix to generate. alpha : float between 0 and 1, optional (default=0.95) The probability that a coefficient is zero (see notes). Larger values enforce more sparsity. random_state : int, RandomS
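A small sketch that generates one matrix and measures how sparse it is. The size is passed positionally because the name of the first parameter may differ between scikit-learn releases (an assumption made here for portability). The result is assumed to be a dense NumPy array, which is the default behavior in the releases I am aware of.

```python
import numpy as np
from sklearn.datasets import make_sparse_spd_matrix

# 6x6 matrix; alpha is the probability that a coefficient is zero.
P = make_sparse_spd_matrix(6, alpha=0.9, random_state=0)

zero_fraction = np.mean(P == 0)
eigenvalues = np.linalg.eigvalsh(P)

print(P.shape)  # (6, 6)
print(zero_fraction)
print(bool(np.all(eigenvalues > 0)))
```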

sklearn.datasets.make_regression()

sklearn.datasets.make_regression(n_samples=100, n_features=100, n_informative=10, n_targets=1, bias=0.0, effective_rank=None, tail_strength=0.5, noise=0.0, shuffle=True, coef=False, random_state=None) Generate a random regression problem. The input set can either be well conditioned (by default) or have a low rank-fat tail singular profile. See make_low_rank_matrix for more details. The output is generated by applying a (potentially biased) random linear regression model with n_inf
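The sketch below uses `coef=True` to retrieve the coefficients of the underlying linear model and checks that, with no noise and no bias, the targets are exactly the linear combination of the inputs. The sizes are illustrative.

```python
import numpy as np
from sklearn.datasets import make_regression

X, y, coef = make_regression(
    n_samples=50,
    n_features=10,
    n_informative=3,
    noise=0.0,
    bias=0.0,
    coef=True,
    random_state=0,
)

print(X.shape, y.shape, coef.shape)  # (50, 10) (50,) (10,)
print(np.count_nonzero(coef))        # 3 informative features
print(np.allclose(y, X @ coef))      # True: noiseless linear model
```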

sklearn.datasets.make_sparse_coded_signal()

sklearn.datasets.make_sparse_coded_signal(n_samples, n_components, n_features, n_nonzero_coefs, random_state=None) Generate a signal as a sparse combination of dictionary elements. Returns a matrix Y = DX, such that D is (n_features, n_components), X is (n_components, n_samples) and each column of X has exactly n_nonzero_coefs non-zero elements. Read more in the User Guide. Parameters: n_samples : int Number of samples to generate. n_components : int Number of components in the
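To make the shapes concrete, here is a NumPy-only sketch of the construction Y = DX described above. It is not the library's implementation: the Gaussian entries, the unit-norm scaling of the dictionary columns, and the way the support of each column is drawn are assumptions made for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)
n_samples, n_components, n_features, n_nonzero_coefs = 6, 8, 5, 2

# Dictionary D: (n_features, n_components), columns scaled to unit norm
# (the scaling is an assumption of this sketch).
D = rng.randn(n_features, n_components)
D /= np.linalg.norm(D, axis=0)

# Sparse code X: (n_components, n_samples), with exactly
# n_nonzero_coefs non-zero entries in each column.
X = np.zeros((n_components, n_samples))
for j in range(n_samples):
    support = rng.choice(n_components, size=n_nonzero_coefs, replace=False)
    X[support, j] = rng.randn(n_nonzero_coefs)

Y = D @ X  # signal matrix: (n_features, n_samples)

print(Y.shape)               # (5, 6)
print((X != 0).sum(axis=0))  # [2 2 2 2 2 2]
```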

sklearn.datasets.make_multilabel_classification()

sklearn.datasets.make_multilabel_classification(n_samples=100, n_features=20, n_classes=5, n_labels=2, length=50, allow_unlabeled=True, sparse=False, return_indicator='dense', return_distributions=False, random_state=None) Generate a random multilabel classification problem. For each sample, the generative process is: pick the number of labels: n ~ Poisson(n_labels); n times, choose a class c: c ~ Multinomial(theta); pick the document length: k ~ Poisson(length); k times, choose a wo
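A minimal call with the default dense indicator output, assuming the function returns a feature matrix and a binary label-indicator matrix with one column per class. The sizes are arbitrary illustration values.

```python
import numpy as np
from sklearn.datasets import make_multilabel_classification

X, Y = make_multilabel_classification(
    n_samples=30, n_features=12, n_classes=4, n_labels=2, random_state=0
)
Y = np.asarray(Y)

print(X.shape)  # (30, 12)
print(Y.shape)  # (30, 4): Y[i, c] is 1 when sample i carries label c

# Number of labels assigned to each sample (0 is possible when
# allow_unlabeled=True, which is the default in the signature above).
labels_per_sample = Y.sum(axis=1)
print(labels_per_sample)
```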