Module 4: Supervised Learning

4.1 Regression

Regression is used when the output variable is a continuous value. It is used to predict real-world quantities such as prices, sales figures, or temperatures.

4.1.1 Linear Regression

Models the relationship between the inputs and a continuous output as a linear function: a straight line when there is one feature, a plane or hyperplane when there are several.

Example 1: Predicting monthly electricity usage from square footage.
Example 2: Estimating the price of a used laptop based on age and brand.
Key Points:
- Assumes the relationship between features and target is approximately linear.
- Coefficients are typically fitted by ordinary least squares, which minimizes the sum of squared errors.

4.1.1.1 Simple Linear Regression

Uses one independent variable to predict a continuous outcome.

Example 1: Predicting crop yield based on rainfall.
Example 2: Estimating weight based on height.
Key Points:
- Fits the line y = b0 + b1x, with a single predictor x.
- Sensitive to outliers, which can pull the fitted line away from the overall trend.
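
A minimal sketch of a one-variable fit with scikit-learn (assumed installed; the height/weight numbers are invented and exactly linear so the result is easy to check):

```python
from sklearn.linear_model import LinearRegression

# Illustrative data: weight (kg) generated as 0.8 * height (cm) - 70.
heights = [[150], [160], [170], [180], [190]]
weights = [50, 58, 66, 74, 82]

model = LinearRegression()
model.fit(heights, weights)

print(model.coef_[0])             # slope, ~0.8
print(model.intercept_)           # ~ -70
print(model.predict([[175]])[0])  # ~70.0
```

Because the data is perfectly linear, ordinary least squares recovers the generating slope and intercept exactly.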

4.1.1.2 Multiple Linear Regression

Involves two or more independent variables to predict the outcome.

Example 1: Predicting house prices based on area, number of rooms, and location.
Example 2: Estimating income from education level, age, and work experience.
Key Points:
- Fits y = b0 + b1x1 + ... + bnxn over several predictors.
- Strongly correlated predictors (multicollinearity) make the coefficients unstable and hard to interpret.
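
A hedged sketch with two illustrative predictors, area and rooms; the prices are synthetic, generated from known coefficients so the fit can be verified:

```python
from sklearn.linear_model import LinearRegression

# Illustrative house data: [area_sq_m, rooms];
# price generated as 100*area + 5000*rooms + 20000.
X = [[50, 1], [60, 2], [80, 2], [100, 3], [120, 3]]
y = [100 * a + 5000 * r + 20000 for a, r in X]

model = LinearRegression().fit(X, y)
print(model.coef_)       # ~[100, 5000]
print(model.intercept_)  # ~20000
```

The fitted coefficients match the generating ones because the synthetic data contains no noise.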

4.1.2 Polynomial Regression

Extends linear regression by allowing curved relationships between variables.

Example 1: Modeling population growth over time.
Example 2: Predicting CPU temperature based on load.
Key Points:
- Adds powers of the inputs as extra features; the model remains linear in its coefficients.
- High polynomial degrees fit the training data closely but can overfit badly.
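
One common way to do this, sketched with scikit-learn's PolynomialFeatures in a pipeline (the load/temperature data is invented and exactly quadratic):

```python
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures

# Illustrative CPU data: temperature = 0.5 * load^2 + 30 (a curved relationship).
X = [[0], [2], [4], [6], [8]]
y = [0.5 * x[0] ** 2 + 30 for x in X]

model = make_pipeline(PolynomialFeatures(degree=2), LinearRegression())
model.fit(X, y)
print(model.predict([[5]])[0])  # ~42.5
```

A plain linear model could not fit this curve; adding the squared feature lets a linear fit capture it exactly.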

4.1.4.1 Pruning in Decision Trees

Reduces model complexity by trimming branches from decision trees.

Example 1: Optimizing decision trees used for loan approvals.
Example 2: Preventing a medical diagnosis model from overfitting.
Key Points:
- Pre-pruning limits growth during training (e.g. maximum depth, minimum samples per leaf); post-pruning trims a fully grown tree.
- A pruned tree generalizes better and is easier to interpret.
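
A small illustration of pre-pruning via a depth limit (scikit-learn assumed; the data is synthetic, with a little noise for the unpruned tree to memorize):

```python
from sklearn.tree import DecisionTreeRegressor

# Two plateaus (around 0 and around 5) plus noise at a couple of points.
X = [[i] for i in range(10)]
y = [0, 0, 1, 0, 0, 5, 5, 6, 5, 5]

full = DecisionTreeRegressor(random_state=0).fit(X, y)      # grows until pure
pruned = DecisionTreeRegressor(max_depth=1, random_state=0).fit(X, y)

print(full.get_depth())    # deeper: it also splits on the noisy points
print(pruned.get_depth())  # 1: a single split separating the two plateaus
```

scikit-learn also supports cost-complexity post-pruning via the `ccp_alpha` parameter of the tree estimators.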

4.1.5 Random Forest Regressor

Uses an ensemble of decision trees for robust predictions.

Example 1: Predicting rental prices across cities.
Example 2: Estimating yield from farmland data.
Key Points:
- Averages the predictions of many trees, each trained on a bootstrap sample with random feature subsets (bagging).
- Much less prone to overfitting than a single deep tree.
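
A minimal sketch, assuming scikit-learn; the rent figures are illustrative:

```python
from sklearn.ensemble import RandomForestRegressor

# Illustrative data: monthly rent vs. apartment area (sq. m).
X = [[20], [35], [50], [65], [80], [95]]
y = [400, 550, 700, 850, 1000, 1150]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)
pred = model.predict([[60]])[0]
print(pred)  # the average of 100 trees' predictions, within the observed range
```

Each tree gives a piecewise-constant prediction; averaging them smooths the output and reduces variance.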

4.1.6 Support Vector Regressor (SVR)

Fits a function that keeps most points within a margin of tolerance (epsilon); only points on or outside that margin, the support vectors, influence the fit.

Example 1: Predicting stock prices.
Example 2: Modeling house energy usage.
Key Points:
- Errors smaller than the tolerance epsilon are ignored; points outside the margin become support vectors.
- Kernel functions allow non-linear regression.
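
A brief sketch with a linear kernel (scikit-learn assumed; the energy data is synthetic and deliberately simple):

```python
from sklearn.svm import SVR

# Illustrative data: house energy usage (kWh) falls as outdoor temperature rises.
X = [[0], [5], [10], [15], [20], [25]]
y = [30.0, 25.0, 20.0, 15.0, 10.0, 5.0]

model = SVR(kernel="linear", C=100, epsilon=0.1).fit(X, y)
print(model.predict([[12]])[0])  # close to 18
```

With `epsilon=0.1`, deviations smaller than 0.1 cost nothing, so the fitted function is the flattest line that stays inside the tolerance tube.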

4.1.8 Underfitting

Occurs when a model is too simple to capture data patterns.

Example 1: Using linear regression on non-linear data.
Example 2: Predicting traffic with only time of day.
Key Points:
- A symptom of high bias: the model performs poorly on both training and test data.
- Remedies include richer models, better features, or less regularization.
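
The first example above can be demonstrated directly: a straight line fitted to synthetic quadratic data explains almost none of the variance (sketch assumes scikit-learn):

```python
from sklearn.linear_model import LinearRegression

# Clearly non-linear (quadratic) data: a straight line cannot capture it.
X = [[x] for x in range(-5, 6)]
y = [x[0] ** 2 for x in X]

model = LinearRegression().fit(X, y)
print(model.score(X, y))  # R^2 near 0: severe underfitting
```

By symmetry the best-fitting line is flat, so its R² is essentially zero even on the training data, the hallmark of underfitting.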

4.1.9 Overfitting

Occurs when a model memorizes training data, failing to generalize.

Example 1: Deep decision tree with noise.
Example 2: Complex polynomial model on few samples.
Key Points:
- A symptom of high variance: low training error but high test error.
- Remedies include regularization, pruning, more training data, and cross-validation.
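
A small demonstration of the first example with an unpruned tree on synthetic noisy data (scikit-learn assumed):

```python
from sklearn.tree import DecisionTreeRegressor

# Noisy samples from the trend y = x: an unpruned tree memorizes the noise.
X_train = [[0], [1], [2], [3], [4], [5], [6], [7]]
y_train = [0.0, 1.4, 1.6, 3.5, 3.7, 5.3, 5.8, 7.2]

tree = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)
print(tree.score(X_train, y_train))  # 1.0: perfect on training data

# New points from the same underlying trend expose the memorized noise:
X_test = [[0.5], [2.5], [4.5], [6.5]]
y_test = [0.5, 2.5, 4.5, 6.5]
print(tree.score(X_test, y_test))    # noticeably below the training score
```

The gap between a perfect training score and a lower test score is the signature of overfitting.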

4.2 Classification

Classification is used when the output is a discrete class. It answers questions like “Is this email spam?” or “Does this image contain a cat?”

4.2.1 Logistic Regression

Predicts the probability of a binary outcome by passing a linear combination of the inputs through the sigmoid function; multi-class variants (e.g. softmax/multinomial logistic regression) exist.

Example 1: Spam email detection.
Example 2: Predicting customer churn.
Key Points:
- Outputs a probability between 0 and 1; a threshold (commonly 0.5) turns it into a class label.
- Despite its name, it is a classification algorithm, not a regression algorithm.
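
A minimal churn-style sketch (scikit-learn assumed; the usage figures are invented and well separated):

```python
from sklearn.linear_model import LogisticRegression

# Illustrative data: [monthly_usage_hours]; 1 = churned, 0 = stayed.
X = [[1], [2], [3], [10], [11], [12]]
y = [1, 1, 1, 0, 0, 0]

model = LogisticRegression().fit(X, y)
print(model.predict([[2]])[0])           # 1: low usage, likely churn
print(model.predict([[11]])[0])          # 0
print(model.predict_proba([[2]])[0][1])  # probability of churning
```

`predict_proba` exposes the underlying probability, which is the real output of the model; `predict` merely thresholds it.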

4.2.2 KNN Classifier

Classifies a sample by the majority class among its k nearest neighbors.

Example 1: Handwritten digit classification.
Example 2: Recommending movies based on user profile.
Key Points:
- A lazy learner: there is no real training phase; distances are computed at prediction time.
- The choice of k and feature scaling strongly affect the result.
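
A toy 2-D sketch (scikit-learn assumed; the two clusters are synthetic):

```python
from sklearn.neighbors import KNeighborsClassifier

# Class 0 clusters near the origin, class 1 near (5, 5).
X = [[0, 0], [1, 0], [0, 1], [5, 5], [6, 5], [5, 6]]
y = [0, 0, 0, 1, 1, 1]

knn = KNeighborsClassifier(n_neighbors=3).fit(X, y)
print(knn.predict([[1, 1]])[0])  # 0: its three nearest neighbours are class 0
print(knn.predict([[5, 4]])[0])  # 1
```

With k=3 the predicted label is simply the majority vote of the three closest training points.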

4.2.3 Naïve Bayes

Uses Bayes' Theorem assuming feature independence.

Example 1: News categorization.
Example 2: Sentiment analysis of reviews.
Key Points:
- "Naïve" because it assumes the features are conditionally independent given the class.
- Fast to train and effective on high-dimensional data such as text.

4.2.3.1 Multinomial Naïve Bayes

Handles word counts and frequencies in text classification.
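
A tiny text-categorization sketch (scikit-learn assumed; the four-document corpus is invented):

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

# Illustrative corpus for sports-vs-politics categorization.
docs = ["the team won the match", "great goal and match win",
        "the election vote result", "parliament vote and debate"]
labels = ["sports", "sports", "politics", "politics"]

vec = CountVectorizer()
X = vec.fit_transform(docs)        # word-count features
clf = MultinomialNB().fit(X, labels)

print(clf.predict(vec.transform(["great match win"]))[0])      # sports
print(clf.predict(vec.transform(["election vote debate"]))[0]) # politics
```

The classifier multiplies per-class word probabilities (with Laplace smoothing), which is why raw word counts are a natural input.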

4.2.3.2 Gaussian Naïve Bayes

Assumes continuous features follow a Gaussian distribution.
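
A brief sketch with continuous features (scikit-learn assumed; the height/weight data is illustrative):

```python
from sklearn.naive_bayes import GaussianNB

# Continuous features: [height_cm, weight_kg]; two well-separated classes.
X = [[150, 50], [155, 53], [160, 55], [180, 80], [185, 85], [190, 90]]
y = [0, 0, 0, 1, 1, 1]

clf = GaussianNB().fit(X, y)
print(clf.predict([[158, 54]])[0])  # 0
print(clf.predict([[183, 82]])[0])  # 1
```

Each feature's per-class mean and variance are estimated from the data, and the Gaussian density replaces the word-count probabilities of the multinomial variant.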

4.2.4 Decision Tree

Splits data into branches to reach a decision or class label.

Example 1: Disease diagnosis from symptoms.
Example 2: Predicting if a loan will be approved.
Key Points:
- Splits are chosen to maximize purity, measured by criteria such as Gini impurity or entropy.
- Easy to interpret, but a fully grown tree overfits without pruning or depth limits.
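
A minimal loan-approval sketch (scikit-learn assumed; the income/debt figures are invented):

```python
from sklearn.tree import DecisionTreeClassifier

# Illustrative data: [income_k, existing_debt_k]; 1 = approved.
X = [[30, 20], [35, 25], [40, 30], [60, 5], [70, 10], [80, 5]]
y = [0, 0, 0, 1, 1, 1]

tree = DecisionTreeClassifier(max_depth=2, random_state=0).fit(X, y)
print(tree.predict([[65, 8]])[0])   # 1: high income, low debt
print(tree.predict([[32, 22]])[0])  # 0
```

Here a single split on income already separates the classes; `max_depth=2` simply caps the tree as a pre-pruning precaution.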

4.2.5 Random Forest

Combines many decision trees; each tree votes for a class, and the majority wins.

Example 1: Classifying email as spam or not.
Example 2: Credit card fraud detection.
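
A small fraud-detection sketch (scikit-learn assumed; the transaction data is synthetic and deliberately clear-cut):

```python
from sklearn.ensemble import RandomForestClassifier

# Illustrative transactions: [amount, hour_of_day]; 1 = fraudulent.
X = [[20, 14], [35, 10], [50, 16], [900, 3], [1200, 2], [1500, 4]]
y = [0, 0, 0, 1, 1, 1]

forest = RandomForestClassifier(n_estimators=50, random_state=0).fit(X, y)
print(forest.predict([[1100, 3]])[0])  # 1: most trees vote "fraud"
print(forest.predict([[40, 12]])[0])   # 0
```

Each of the 50 trees is trained on a bootstrap sample; the predicted class is the one most trees vote for.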

4.2.6 Support Vector Machine (SVM)

Finds the optimal hyperplane to separate classes with the widest margin.

Example 1: Image classification.
Example 2: Face recognition system.
Key Points:
- Only the support vectors (the points closest to the boundary) determine the hyperplane.
- The kernel trick handles classes that are not linearly separable.
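
A minimal sketch with two linearly separable clusters (scikit-learn assumed; the points are synthetic):

```python
from sklearn.svm import SVC

# Two well-separated clusters in 2-D.
X = [[1, 1], [2, 1], [1, 2], [8, 8], [9, 8], [8, 9]]
y = [0, 0, 0, 1, 1, 1]

svm = SVC(kernel="linear").fit(X, y)
print(svm.predict([[2, 2]])[0])           # 0
print(svm.predict([[8, 7]])[0])           # 1
print(svm.support_vectors_.shape[0])      # only the boundary points matter
```

Points deep inside each cluster get zero weight; the maximum-margin hyperplane is defined entirely by the few support vectors nearest the other class.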