Supervised Learning

Supervised learning is about training models on labeled data to make predictions. The goal is to make accurate predictions on unseen data.

There are two types of supervised learning:

  1. Classification – the target is a categorical variable
  2. Regression – the target is a continuous variable

A few basic terms:

  1. Features
  2. Labels
  3. Accuracy

Features are the inputs to the model. They are also known as independent variables. In machine learning they are conventionally represented as X.

Labels are the target variable, also known as the dependent variable or response variable. In machine learning they are conventionally represented as y. We treat y as a function of X: X is the input and y is the output.

y=f(X)
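
To make X and y concrete, here is a minimal sketch of how they might look in code. The toy data is invented for illustration (the features, values and label encodings are assumptions, not a real dataset).

```python
# Hypothetical toy data: each row of X is one observation,
# each column is one feature (here: height in cm, weight in kg).
X = [
    [170.0, 65.0],
    [160.0, 72.0],
    [182.0, 80.0],
    [155.0, 50.0],
]

# Classification: the label is a category (encoded here as 0 or 1).
y_class = [0, 1, 1, 0]

# Regression: the label is a continuous number.
y_reg = [22.5, 28.1, 24.2, 20.8]

# Every observation needs exactly one label.
assert len(X) == len(y_class) == len(y_reg)

n_samples, n_features = len(X), len(X[0])
print(n_samples, n_features)  # 4 observations, 2 features
```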

Accuracy is the ratio of correct predictions to total observations. It is one of the most common metrics for evaluating classification models.

Accuracy = Correct predictions / Total observations
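
The formula above can be sketched in a few lines of plain Python. The function name `accuracy` and the sample labels are assumptions made up for this example.

```python
def accuracy(y_true, y_pred):
    """Return the fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on empty input")
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true)


y_true = [1, 0, 1, 1, 0]  # actual labels
y_pred = [1, 0, 0, 1, 0]  # model output (one mistake, at index 2)
print(accuracy(y_true, y_pred))  # 4 correct out of 5 -> 0.8
```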

There are a few requirements before training a model:

  • The dataset should not have any missing values (recommended)
  • The data should be in numeric format
  • The data should be in a dataframe or an array

So once you have a dataset, you need to do preliminary work such as data cleaning and missing value treatment before training the model.
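
As a rough sketch of that preparation, the snippet below fills a missing value with the column mean and converts a text column to integer codes. The rows are hypothetical. Mean filling and integer codes are only one simple choice among several, so treat this as an illustration rather than a recommended recipe.

```python
from statistics import mean

# Hypothetical raw rows: (age, city), with one missing age.
raw = [
    (25.0, "Paris"),
    (None, "Tokyo"),
    (35.0, "Paris"),
]

# Missing value treatment: replace None with the mean of the known ages.
known_ages = [age for age, _ in raw if age is not None]
age_fill = mean(known_ages)

# Numeric format: map each distinct city to an integer code.
cities = sorted({city for _, city in raw})
city_codes = {city: code for code, city in enumerate(cities)}

# Final array-like structure: one numeric row per observation.
X = [
    [age if age is not None else age_fill, city_codes[city]]
    for age, city in raw
]
print(X)  # [[25.0, 0], [30.0, 1], [35.0, 0]]
```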

Solving a classification problem involves a few steps:

  1. Get the dataset
  2. Split the dataset (Train/Test)
  3. Fit the model (train it on the training set)
  4. Predict with the model (compare the model output against your actual test data)
  5. Score the model (how accurate is your model)
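
The five steps above can be sketched end to end in plain Python. Everything here is an assumption made for illustration: the toy data, the 30% test size, and the classifier itself, a very simple nearest-centroid rule picked only because it fits in a few lines. In practice you would normally use a machine learning library for the split, fit, predict and score steps.

```python
import random

# 1. Get the dataset: hypothetical 2-feature points in two classes.
X = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9], [1.1, 1.1], [0.9, 0.8],
     [3.0, 3.2], [2.8, 3.0], [3.2, 2.9], [3.1, 3.1], [2.9, 2.8]]
y = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]

# 2. Split the dataset: shuffle the indices, hold out 30% for testing.
rng = random.Random(0)  # fixed seed so the split is repeatable
indices = list(range(len(X)))
rng.shuffle(indices)
n_test = int(0.3 * len(X))
test_idx, train_idx = indices[:n_test], indices[n_test:]
X_train = [X[i] for i in train_idx]
y_train = [y[i] for i in train_idx]
X_test = [X[i] for i in test_idx]
y_test = [y[i] for i in test_idx]


# 3. Fit: compute one centroid (mean point) per class.
def fit(X_train, y_train):
    sums, counts = {}, {}
    for row, label in zip(X_train, y_train):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc]
            for label, acc in sums.items()}


# 4. Predict: assign each point to the class with the nearest centroid.
def predict(centroids, X_new):
    def sq_dist(a, b):
        return sum((u - v) ** 2 for u, v in zip(a, b))
    return [min(centroids, key=lambda label: sq_dist(centroids[label], row))
            for row in X_new]


# 5. Score: accuracy = correct predictions / total observations.
def score(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


centroids = fit(X_train, y_train)
y_pred = predict(centroids, X_test)
print(score(y_test, y_pred))  # the two toy classes are far apart, so 1.0
```

Note that the model only sees the training rows during fit. The test rows are kept aside so the score reflects performance on unseen data.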
