In supervised learning, a model is trained on a labelled dataset, in which every example is paired with its correct output, so the model learns what each type of data looks like.
Once training is complete, the model is evaluated on test data (a held-out portion of the labelled data that was not used for training) by having it predict the output for each test example.
Suppose we have a dataset of different types of shapes, including squares, rectangles, triangles, and other polygons. The first step is to train the model on labelled examples of each shape.
If the given shape has four sides and all the sides are equal, it is labelled as a square.
If the given shape has three sides, it is labelled as a triangle.
If the given shape has six sides, it is labelled as a hexagon.
Now, after training, we test our model using the test set, and the task of the model is to identify the shape.
The model has already been trained on every type of shape, so when it encounters a new shape, it classifies it based on the number of sides and predicts the output. How often those predictions are correct tells us how well the model has learned.
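To make the shape example concrete, here is a minimal Python sketch. The feature encoding (number of sides, whether all sides are equal), the sample data, and the `train` and `predict` helpers are illustrative assumptions, not part of any library. The "model" simply memorises the most common label for each feature combination, which is enough to show the train-then-test flow.

```python
from collections import Counter, defaultdict

# Hypothetical labelled training data:
# features are (number_of_sides, all_sides_equal), paired with the correct label.
training_data = [
    ((4, True), "square"),
    ((4, False), "rectangle"),
    ((3, True), "triangle"),
    ((3, False), "triangle"),
    ((6, True), "hexagon"),
    ((4, True), "square"),
]

def train(examples):
    """Learn, for each feature combination, the label seen most often."""
    counts = defaultdict(Counter)
    for features, label in examples:
        counts[features][label] += 1
    return {features: c.most_common(1)[0][0] for features, c in counts.items()}

def predict(model, features, default="unknown"):
    """Return the learned label, or `default` for unseen feature combinations."""
    return model.get(features, default)

model = train(training_data)

# Held-out test examples, kept separate from the training list above.
test_data = [((4, True), "square"), ((6, True), "hexagon"), ((3, False), "triangle")]
predictions = [predict(model, features) for features, _ in test_data]
accuracy = sum(p == label for p, (_, label) in zip(predictions, test_data)) / len(test_data)
print(predictions, accuracy)
```

A real learning algorithm would generalise to feature combinations it has never seen, whereas this lookup table returns "unknown" for them.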
- First, determine the type of training dataset.
- Collect the labelled training data.
- Split the labelled dataset into a training set, a validation set, and a test set.
- Determine the input features of the training dataset, which should carry enough information for the model to predict the output accurately.
- Choose a suitable algorithm for the model, such as a support vector machine or a decision tree.
- Run the algorithm on the training set. Sometimes a validation set, held out from the training data, is also needed to tune the control parameters (hyperparameters).
- Evaluate the accuracy of the model on the test set. If the model predicts the correct output for the test examples, the model is accurate.
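The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data is synthetic, the 60/20/20 split is an arbitrary choice, and a hand-written k-nearest-neighbours classifier (the hypothetical `knn_predict` helper) stands in for whichever algorithm is actually chosen.

```python
import math
import random
from collections import Counter

# Collect labelled data: two well-separated groups of 2-D points (synthetic).
data = [((float(i), float(i % 5)), "A") for i in range(30)]
data += [((100.0 + i, float(i % 5)), "B") for i in range(30)]

# Split into training, validation, and test sets (60/20/20).
rng = random.Random(0)  # fixed seed so the split is reproducible
rng.shuffle(data)
n_train, n_val = 36, 12
train_set = data[:n_train]
val_set = data[n_train:n_train + n_val]
test_set = data[n_train + n_val:]

def knn_predict(train, point, k):
    """Label `point` by majority vote among its k nearest training points."""
    nearest = sorted(train, key=lambda ex: math.dist(ex[0], point))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

def accuracy(train, examples, k):
    """Fraction of `examples` whose predicted label matches the true label."""
    correct = sum(knn_predict(train, x, k) == y for x, y in examples)
    return correct / len(examples)

# Use the validation set to choose the control parameter k.
best_k = max((1, 3, 5), key=lambda k: accuracy(train_set, val_set, k))

# Evaluate the chosen model once on the untouched test set.
test_accuracy = accuracy(train_set, test_set, best_k)
print(f"chosen k = {best_k}, test accuracy = {test_accuracy:.2f}")
```

The test set is used only once, after k has been chosen, so the reported accuracy is not influenced by the tuning step.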