6. Unsupervised Learning
6.1 Clustering
Definition: Clustering is an unsupervised learning technique that groups similar data points into clusters without labeled data.
Explanation: It helps discover hidden patterns or groupings in data and is widely used in customer segmentation, social network analysis, and image compression.
- Grouping news articles by topic.
- Segmenting customers by purchasing behavior.
- No labeled data required.
- Output is groups/clusters with high intra-group similarity.
- Popular algorithms: K-Means, DBSCAN, and hierarchical clustering.
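As a minimal sketch of clustering in practice, the snippet below runs DBSCAN (one of the algorithms listed above) on a small synthetic dataset. It assumes scikit-learn is installed; the dataset and the eps/min_samples values are illustrative choices only.

```python
# A minimal sketch of clustering unlabeled data with DBSCAN, assuming scikit-learn.
# Unlike K-Means, DBSCAN needs no preset cluster count; points in sparse regions
# are marked as noise (label -1).
from sklearn.datasets import make_moons
from sklearn.cluster import DBSCAN

X, _ = make_moons(n_samples=300, noise=0.05, random_state=42)  # unlabeled 2-D points

db = DBSCAN(eps=0.2, min_samples=5)   # eps: neighborhood radius, min_samples: density threshold
labels = db.fit_predict(X)            # one cluster index per point, -1 marks noise

print("clusters found:", len(set(labels) - {-1}))
```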
6.1.1 K-Means
Definition: K-Means is a centroid-based clustering algorithm that partitions data into K clusters based on proximity to cluster centers.
Explanation: The algorithm minimizes intra-cluster distances by assigning each point to the nearest centroid and then recalculating centroids until convergence.
- Customer segmentation in marketing.
- Image color quantization.
- Requires K to be predefined.
- Sensitive to initial centroids.
- Fast and scalable.
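To make the assign-and-update loop from the explanation concrete, here is a from-scratch sketch using only NumPy. The random initialization, fixed iteration cap, and lack of empty-cluster handling are simplifications for illustration.

```python
# A from-scratch sketch of the K-Means loop: assign points to the nearest centroid,
# recompute centroids, repeat until the centroids stop moving. Assumes only NumPy.
import numpy as np

def kmeans(X, k, n_iters=100, seed=0):
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]   # random initial centroids
    for _ in range(n_iters):
        # Assignment step: each point goes to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each centroid moves to the mean of its assigned points.
        # (Empty clusters are not handled in this simplified sketch.)
        new_centroids = np.array([X[labels == j].mean(axis=0) for j in range(k)])
        if np.allclose(new_centroids, centroids):               # converged
            break
        centroids = new_centroids
    return labels, centroids

# Toy data: two well-separated blobs.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels, centroids = kmeans(X, k=2)
print(centroids)
```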
6.1.1.1 Elbow Method
Definition: The Elbow Method is a technique for choosing the number of clusters (K) by plotting within-cluster variance against the number of clusters; the "elbow", the point where adding more clusters yields only diminishing reductions in variance, indicates a good K.
- Used in K-Means to determine the ideal K in customer segmentation.
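A sketch of the Elbow Method using scikit-learn's KMeans and matplotlib (both assumed to be installed); the synthetic dataset and the range of K values are illustrative.

```python
# Elbow Method sketch: inertia_ is the within-cluster sum of squared distances.
# Plotting it against K, the bend ("elbow") suggests a reasonable number of clusters.
from sklearn.datasets import make_blobs
from sklearn.cluster import KMeans
import matplotlib.pyplot as plt

X, _ = make_blobs(n_samples=500, centers=4, random_state=42)

ks = range(1, 10)
inertias = [KMeans(n_clusters=k, n_init=10, random_state=42).fit(X).inertia_ for k in ks]

plt.plot(ks, inertias, marker="o")
plt.xlabel("number of clusters K")
plt.ylabel("within-cluster sum of squares (inertia)")
plt.show()   # the bend around K = 4 suggests the chosen K for this data
```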
6.1.1.2 K-Means++
Definition: K-Means++ is an enhanced version of K-Means that improves initial centroid selection by spreading the starting centroids apart, avoiding the poor clustering results that purely random initialization can produce.
- Used in clustering large-scale location data with better consistency.
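The core of K-Means++ is its seeding rule: after a uniformly random first centre, each subsequent centre is sampled with probability proportional to its squared distance from the nearest centre chosen so far. Below is a NumPy-only sketch of that rule; the function name and parameters are illustrative.

```python
# K-Means++ seeding sketch: far-away points are more likely to become new centres,
# which spreads the initial centroids apart. Assumes only NumPy.
import numpy as np

def kmeans_pp_init(X, k, seed=0):
    rng = np.random.default_rng(seed)
    centers = [X[rng.integers(len(X))]]            # first centre: uniform at random
    for _ in range(k - 1):
        # Squared distance from each point to its nearest already-chosen centre.
        d2 = np.min([np.sum((X - c) ** 2, axis=1) for c in centers], axis=0)
        probs = d2 / d2.sum()                      # far points get higher probability
        centers.append(X[rng.choice(len(X), p=probs)])
    return np.array(centers)
```

In scikit-learn this seeding is the default behaviour of KMeans (init="k-means++").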
6.1.2 Hierarchical Clustering
Definition: Hierarchical clustering builds a tree of clusters by either merging or splitting them recursively.
- Taxonomy of species.
- Document categorization.
6.1.2.1 Agglomerative Clustering
Definition: A bottom-up approach where each data point starts as its own cluster, and pairs of clusters are merged as one moves up the hierarchy.
- Grouping species by genetic similarity in biology.
- Merging social network users based on mutual friends.
- Does not require specifying the number of clusters (K) up front; K can be chosen afterwards by cutting the dendrogram.
- Produces a dendrogram to visualize cluster merging.
- Slower than K-Means on large datasets due to repeated distance calculations.
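A minimal agglomerative (bottom-up) clustering sketch, assuming scikit-learn; the Ward linkage and the blob dataset are illustrative choices.

```python
# Agglomerative clustering sketch: clusters are merged bottom-up.
# Ward linkage merges the pair of clusters that least increases total variance.
from sklearn.datasets import make_blobs
from sklearn.cluster import AgglomerativeClustering

X, _ = make_blobs(n_samples=200, centers=3, random_state=42)

agg = AgglomerativeClustering(n_clusters=3, linkage="ward")
labels = agg.fit_predict(X)
print(labels[:10])

# Alternatively, pass n_clusters=None with distance_threshold=... to cut the
# merge tree at a height instead of fixing K in advance.
```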
6.1.2.2 Divisive Clustering
Definition: A top-down approach that starts with all data in one cluster and recursively splits it into smaller clusters.
- Separating academic departments from a university-wide dataset.
- Breaking down a customer base from general to specific segments.
- Works in the opposite direction to agglomerative clustering: it starts with one large cluster and splits downward.
- Less commonly used due to higher computational complexity.
- Useful when the natural grouping is known to start large.
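There is no single canonical divisive routine, so the sketch below illustrates the top-down idea in a simplified, bisecting style: everything starts in one cluster and the largest cluster is repeatedly split in two with K-Means until the desired count is reached. It assumes scikit-learn and NumPy; the split choice and stopping rule are assumptions made for illustration.

```python
# Simplified divisive (top-down) clustering sketch: repeatedly bisect the largest
# cluster with 2-means until k clusters remain.
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

def divisive(X, k, seed=0):
    clusters = [np.arange(len(X))]                     # start: one cluster with all points
    while len(clusters) < k:
        i = max(range(len(clusters)), key=lambda j: len(clusters[j]))
        idx = clusters.pop(i)                          # take the largest cluster ...
        halves = KMeans(n_clusters=2, n_init=10, random_state=seed).fit_predict(X[idx])
        clusters += [idx[halves == 0], idx[halves == 1]]   # ... and split it in two
    labels = np.empty(len(X), dtype=int)
    for c, idx in enumerate(clusters):
        labels[idx] = c
    return labels

X, _ = make_blobs(n_samples=300, centers=4, random_state=42)
print(np.bincount(divisive(X, k=4)))                   # points per final cluster
```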
6.1.2.3 Dendrograms
Definition: A dendrogram is a tree-like diagram that records the sequences of merges or splits in hierarchical clustering.
- Visualizing how different types of wines group based on taste features.
- Showing user communities in a social media clustering analysis.
- Shows the entire clustering process and merging steps.
- Cutting the dendrogram at a specific height reveals K clusters.
- Helps evaluate natural grouping and hierarchy in the data.
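A sketch of building and cutting a dendrogram with SciPy's hierarchical-clustering utilities (scikit-learn and matplotlib are also assumed, for the data and the plot); the cut height of 10 is an arbitrary illustrative value.

```python
# Dendrogram sketch: linkage records every merge and its distance,
# dendrogram draws the tree, and fcluster cuts it at a chosen height.
from scipy.cluster.hierarchy import linkage, dendrogram, fcluster
from sklearn.datasets import make_blobs
import matplotlib.pyplot as plt

X, _ = make_blobs(n_samples=50, centers=3, random_state=42)

Z = linkage(X, method="ward")        # full record of the merge sequence
dendrogram(Z)                        # tree-like diagram of the merges
plt.ylabel("merge distance")
plt.show()

labels = fcluster(Z, t=10, criterion="distance")   # cut the tree at height 10
print(len(set(labels)), "clusters at this cut height")
```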
6.2 Association
Definition: Association rule learning is a rule-based machine learning method for discovering interesting relations between variables in large datasets.
Explanation: It finds frequent patterns, associations, or correlations in data and is often used for market basket analysis to identify products that are frequently bought together.
- "People who buy bread often buy butter."
- Amazon recommendation rules.
- Uses support, confidence, and lift to evaluate rules.
- Apriori and FP-Growth are common algorithms.
- Highly interpretable rules.
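As a small worked example of the three metrics above, the plain-Python sketch below computes support, confidence, and lift for the rule "bread → butter" on a toy set of transactions (the transactions themselves are made-up illustrative data).

```python
# Support, confidence, and lift for the rule "bread -> butter" on toy transactions.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "milk"},
    {"bread", "jam"},
    {"milk", "eggs"},
    {"bread", "butter", "jam"},
]
n = len(transactions)

def support(itemset):
    # Fraction of transactions that contain every item in the itemset.
    return sum(itemset <= t for t in transactions) / n

sup_rule = support({"bread", "butter"})       # P(bread and butter)
confidence = sup_rule / support({"bread"})    # P(butter | bread)
lift = confidence / support({"butter"})       # > 1 indicates a positive association

print(f"support={sup_rule:.2f} confidence={confidence:.2f} lift={lift:.2f}")
```

For realistic datasets, frequent itemsets are mined with algorithms such as Apriori or FP-Growth (named above) rather than by scanning every transaction for every candidate rule.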