sklearn.utils.resample(*arrays, **options)
Resample arrays or sparse matrices in a consistent way.
The default strategy implements one step of the bootstrapping procedure.
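As a minimal sketch of what "one step of the bootstrapping procedure" means in practice, the snippet below repeats the default call many times to approximate the standard error of a mean. The data values, the number of replicates, and the variable names are made up for illustration and are not part of the original documentation.

```python
import numpy as np
from sklearn.utils import resample

# Hypothetical measurements, invented for this illustration.
data = np.array([2.1, 3.4, 1.9, 5.0, 4.2, 3.3, 2.8, 4.7])

# With the defaults, each call draws len(data) samples with replacement,
# which is one bootstrap replicate. Repeating the call with different
# seeds gives a distribution of the statistic of interest (here, the mean).
n_replicates = 1000
boot_means = np.array([
    resample(data, random_state=seed).mean() for seed in range(n_replicates)
])

# The spread of the replicate means estimates the standard error of the mean.
boot_se = boot_means.std(ddof=1)
print("bootstrap standard error of the mean:", boot_se)
```

Passing a different integer seed per replicate keeps the whole run reproducible while still producing different draws.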
Parameters: *arrays : sequence of indexable data-structures
Indexable data-structures can be arrays, lists, dataframes or scipy sparse matrices with consistent first dimension.
replace : boolean, True by default
Implements resampling with replacement. If False, this will implement (sliced) random permutations.
n_samples : int, None by default
Number of samples to generate. If left as None, this is automatically set to the first dimension of the arrays. If replace is False, it should not be larger than the length of the arrays.
random_state : int or RandomState instance
Control the shuffling for reproducible behavior.
Returns: resampled_arrays : sequence of indexable data-structures
Sequence of resampled views of the collections. The original arrays are not impacted.
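The parameters above can be sketched together as follows. The arrays and the seed are arbitrary choices for illustration. The `ValueError` in the last step is the behavior in the scikit-learn versions I am aware of and should be treated as an assumption for any other version.

```python
import numpy as np
from sklearn.utils import resample

X = np.arange(10).reshape(5, 2)  # 5 rows of hypothetical data
y = np.arange(5)                 # label i belongs to row i

# replace=False with n_samples < len(X): a random subset without repeats.
X_sub, y_sub = resample(X, y, replace=False, n_samples=3, random_state=42)

# The same row indices are applied to every array, so rows stay aligned.
aligned = all((X[label] == row).all() for row, label in zip(X_sub, y_sub))

# replace=False with the default n_samples: a full random permutation.
y_perm = resample(y, replace=False, random_state=42)

# The same integer seed reproduces the same draw.
y_again = resample(y, replace=False, random_state=42)

# Asking for more samples than available without replacement is rejected
# (assumed to raise ValueError, as in the versions known to the editor).
try:
    resample(y, replace=False, n_samples=6, random_state=42)
    too_many_rejected = False
except ValueError:
    too_many_rejected = True
```

The inputs `X` and `y` are left unchanged by all of these calls.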
Examples
It is possible to mix sparse and dense arrays in the same run:
>>> import numpy as np
>>> X = np.array([[1., 0.], [2., 1.], [0., 0.]])
>>> y = np.array([0, 1, 2])
>>> from scipy.sparse import coo_matrix
>>> X_sparse = coo_matrix(X)
>>> from sklearn.utils import resample
>>> X, X_sparse, y = resample(X, X_sparse, y, random_state=0)
>>> X
array([[1., 0.],
       [2., 1.],
       [1., 0.]])
>>> X_sparse
<3x2 sparse matrix of type '<... 'numpy.float64'>'
    with 4 stored elements in Compressed Sparse Row format>
>>> X_sparse.toarray()
array([[1., 0.],
       [2., 1.],
       [1., 0.]])
>>> y
array([0, 1, 0])
>>> resample(y, n_samples=2, random_state=0)
array([0, 1])
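Because every array passed in one call is resampled with the same row indices, a plain index array can be resampled alongside the data to record which rows were drawn. The following is a small sketch of that idea, not something the original documentation describes. The names `idx`, `idx_boot`, and `oob_idx` are hypothetical.

```python
import numpy as np
from sklearn.utils import resample

X = np.array([[1., 0.], [2., 1.], [0., 0.]])
y = np.array([0, 1, 2])

# Resample an index array together with the data to see which rows were picked.
idx = np.arange(len(y))
X_boot, y_boot, idx_boot = resample(X, y, idx, random_state=0)

# Rows that were never drawn form the "out-of-bag" set for this bootstrap step.
oob_idx = np.setdiff1d(idx, idx_boot)

print("drawn rows:     ", idx_boot)
print("out-of-bag rows:", oob_idx)
```

The out-of-bag rows can serve as a held-out set for the model fitted on the resampled rows.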

2025-01-10 15:47:30