The problem solved in supervised learning
Section contents In this section, we introduce the
Often the hardest part of solving a machine learning problem can be finding the right estimator for the job. Different estimators are better suited for different types of data and different
Pipelining We have seen that some estimators can transform data and that some estimators can predict variables. We can also create combined estimators:
Statistical learning
Datasets Scikit-learn deals with learning information from one or more datasets that are represented as 2D arrays. They can be understood as a list of multi-dimensional observations
Score, and cross-validated scores As we have seen, every estimator exposes a score method that can judge the
Clustering: grouping observations together The problem solved in clustering Given the
The goal of this guide is to explore some of the main scikit-learn tools on a single practical task: analysing a collection of text documents (newsgroups
An introduction to machine learning