Plot several randomly generated 2D classification datasets. This example illustrates the datasets.make_classification, datasets.make_blobs and datasets.make_gaussian_quantiles functions.
For make_classification, three binary datasets and one multi-class dataset are generated, with different numbers of informative features and clusters per class.
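To make the effect of these parameters concrete, the following is a minimal sketch rather than part of the original example. The values n_samples=100 and random_state=0 are assumptions chosen only so the output is repeatable. It also illustrates why the one-informative-feature case is paired with a single cluster per class: make_classification rejects combinations where n_classes * n_clusters_per_class exceeds 2 ** n_informative.

```python
import numpy as np
from sklearn.datasets import make_classification

# random_state=0 and n_samples=100 are assumptions for repeatable output;
# the example itself leaves the seed unset.
X, y = make_classification(
    n_samples=100,
    n_features=2,
    n_redundant=0,
    n_informative=2,
    n_clusters_per_class=1,
    n_classes=3,
    random_state=0,
)
print(X.shape)       # (100, 2)
print(np.unique(y))  # [0 1 2]

# make_classification requires
#   n_classes * n_clusters_per_class <= 2 ** n_informative,
# so one informative feature cannot carry two clusters for each of two classes.
try:
    make_classification(n_features=2, n_redundant=0, n_informative=1,
                        n_clusters_per_class=2)
    constraint_error = None
except ValueError as exc:
    constraint_error = str(exc)
print(constraint_error)
```

The first call returns a feature matrix with one row per sample and an integer label vector; the second call is expected to fail, and the printed message names the violated constraint.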

    print(__doc__)

    import matplotlib.pyplot as plt

    from sklearn.datasets import make_classification
    from sklearn.datasets import make_blobs
    from sklearn.datasets import make_gaussian_quantiles

    # One figure with a 3 x 2 grid of scatter plots, one per dataset.
    plt.figure(figsize=(8, 8))
    plt.subplots_adjust(bottom=.05, top=.9, left=.05, right=.95)

    # Binary problem: one informative feature, one cluster per class.
    plt.subplot(321)
    plt.title("One informative feature, one cluster per class", fontsize='small')
    X1, Y1 = make_classification(n_features=2, n_redundant=0, n_informative=1,
                                 n_clusters_per_class=1)
    plt.scatter(X1[:, 0], X1[:, 1], marker='o', c=Y1)

    # Binary problem: two informative features, one cluster per class.
    plt.subplot(322)
    plt.title("Two informative features, one cluster per class", fontsize='small')
    X1, Y1 = make_classification(n_features=2, n_redundant=0, n_informative=2,
                                 n_clusters_per_class=1)
    plt.scatter(X1[:, 0], X1[:, 1], marker='o', c=Y1)

    # Binary problem: n_clusters_per_class is left at its default of 2.
    plt.subplot(323)
    plt.title("Two informative features, two clusters per class", fontsize='small')
    X2, Y2 = make_classification(n_features=2, n_redundant=0, n_informative=2)
    plt.scatter(X2[:, 0], X2[:, 1], marker='o', c=Y2)

    # Three-class problem with a single cluster per class.
    plt.subplot(324)
    plt.title("Multi-class, two informative features, one cluster",
              fontsize='small')
    X1, Y1 = make_classification(n_features=2, n_redundant=0, n_informative=2,
                                 n_clusters_per_class=1, n_classes=3)
    plt.scatter(X1[:, 0], X1[:, 1], marker='o', c=Y1)

    # Three isotropic Gaussian blobs.
    plt.subplot(325)
    plt.title("Three blobs", fontsize='small')
    X1, Y1 = make_blobs(n_features=2, centers=3)
    plt.scatter(X1[:, 0], X1[:, 1], marker='o', c=Y1)

    # A single Gaussian split into three classes by quantile.
    plt.subplot(326)
    plt.title("Gaussian divided into three quantiles", fontsize='small')
    X1, Y1 = make_gaussian_quantiles(n_features=2, n_classes=3)
    plt.scatter(X1[:, 0], X1[:, 1], marker='o', c=Y1)

    plt.show()

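Because no seed is set in the script, each run draws different points. The sketch below is not part of the original example; it assumes random_state=0 and n_samples=90 purely for illustration, and shows how make_blobs and make_gaussian_quantiles can be made repeatable and how their labels are distributed.

```python
import numpy as np
from sklearn.datasets import make_blobs, make_gaussian_quantiles

# random_state=0 and n_samples=90 are assumptions chosen for illustration;
# the example script itself leaves both at their defaults.
X_blobs, y_blobs = make_blobs(n_samples=90, n_features=2, centers=3,
                              random_state=0)
X_gq, y_gq = make_gaussian_quantiles(n_samples=90, n_features=2, n_classes=3,
                                     random_state=0)

# Per-class sample counts for each generator.
print(X_blobs.shape, np.bincount(y_blobs))
print(X_gq.shape, np.bincount(y_gq))

# With a fixed seed, a second call reproduces the same points.
X_again, y_again = make_blobs(n_samples=90, n_features=2, centers=3,
                              random_state=0)
same_points = np.array_equal(X_blobs, X_again)
print(same_points)
```

Passing the same random_state to each generator in the plotting script would give an identical figure on every run, which can help when comparing parameter settings.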
Total running time of the script: (0 minutes 0.319 seconds)
Download Python source code: plot_random_dataset.py
Download IPython notebook: plot_random_dataset.ipynb