ML Exam Prep

Praveen Kumar Anwla edited this page Apr 30, 2025 · 1 revision
  1. Explain Soft K-Means clustering with the help of an example.

Answer: Let’s walk through a step-by-step example of Soft K-Means on a small dataset with concrete numbers, so that each step becomes intuitive.


Goal: Cluster 3 data points into 2 clusters using Soft K-Means


Step 0: Setup

  • Data points:

    x₁ = 1.0
    x₂ = 2.0
    x₃ = 5.0
    
  • Number of clusters (K): 2

  • Initial centroids:

    μ₁ = 1.5
    μ₂ = 4.5
    
  • Softness parameter $\beta = 1$ (larger values of β give "harder" assignments, closer to standard K-Means)


🔁 Iteration 1


Step 1: Compute responsibilities

We use the formula:

$$ r_{ik} = \frac{e^{-\beta (x_i - \mu_k)^2}}{\sum_{j=1}^{K} e^{-\beta (x_i - \mu_j)^2}} $$

Let’s compute for each point:


For $x_1 = 1.0$:

  • Squared distance to μ₁ = (1.0 − 1.5)² = 0.25
  • Squared distance to μ₂ = (1.0 − 4.5)² = 12.25

$$ r_{11} = \frac{e^{-0.25}}{e^{-0.25} + e^{-12.25}} \approx \frac{0.7788}{0.7788 + 4.8 \times 10^{-6}} \approx 0.9999938 $$

$$ r_{12} = 1 - r_{11} \approx 0.0000062 $$

For $x_2 = 2.0$:

  • Squared distance to μ₁ = (2.0 − 1.5)² = 0.25
  • Squared distance to μ₂ = (2.0 − 4.5)² = 6.25

$$ r_{21} = \frac{e^{-0.25}}{e^{-0.25} + e^{-6.25}} \approx \frac{0.7788}{0.7788 + 0.0019} \approx 0.9976 $$

$$ r_{22} = 1 - r_{21} \approx 0.0024 $$


For $x_3 = 5.0$:

  • Squared distance to μ₁ = (5.0 − 1.5)² = 12.25
  • Squared distance to μ₂ = (5.0 − 4.5)² = 0.25

$$ r_{31} = \frac{e^{-12.25}}{e^{-12.25} + e^{-0.25}} \approx \frac{4.8 \times 10^{-6}}{4.8 \times 10^{-6} + 0.7788} \approx 0.0000062 $$

$$ r_{32} = 1 - r_{31} \approx 0.9999938 $$
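
The responsibility calculation in Step 1 can be checked with a few lines of Python. This is a minimal sketch for the 1-D example only; the function name `responsibilities` is illustrative and not taken from any library. Because it does not round the intermediate exponentials, its output may differ from the hand-computed values in the last digit.

```python
import math

def responsibilities(points, centroids, beta=1.0):
    """Return r[i][k], the soft assignment of point i to cluster k."""
    result = []
    for x in points:
        # Unnormalised weights exp(-beta * squared distance), one per centroid.
        weights = [math.exp(-beta * (x - mu) ** 2) for mu in centroids]
        total = sum(weights)
        # Normalise so that each point's responsibilities sum to 1.
        result.append([w / total for w in weights])
    return result

points = [1.0, 2.0, 5.0]
r = responsibilities(points, [1.5, 4.5], beta=1.0)
for x, row in zip(points, r):
    print(x, [round(v, 7) for v in row])
```

Each printed row holds the two responsibilities of one data point, in the order (cluster 1, cluster 2).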


Step 2: Update centroids

Use the weighted average formula, where the sums run over all $N$ data points (here $N = 3$):

$$ \mu_k = \frac{\sum_{i=1}^{N} r_{ik} \cdot x_i}{\sum_{i=1}^{N} r_{ik}} $$
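
As a rough sketch, the weighted average can be written in Python as follows. The helper name `update_centroids` is hypothetical, and the responsibilities are the rounded values from Step 1, entered by hand.

```python
def update_centroids(points, resp):
    """Responsibility-weighted mean of the points, one value per cluster."""
    n_clusters = len(resp[0])
    new_centroids = []
    for k in range(n_clusters):
        # Numerator: sum over points of r_ik * x_i.
        numerator = sum(r_i[k] * x for x, r_i in zip(points, resp))
        # Denominator: total responsibility assigned to cluster k.
        denominator = sum(r_i[k] for r_i in resp)
        new_centroids.append(numerator / denominator)
    return new_centroids

points = [1.0, 2.0, 5.0]
# Rounded responsibilities from Step 1 (rows: points, columns: clusters).
resp = [
    [0.9999938, 0.0000062],
    [0.9976, 0.0024],
    [0.0000062, 0.9999938],
]
mu_new = update_centroids(points, resp)
print([round(m, 4) for m in mu_new])
```

The denominator acts as an "effective number of points" in each cluster, which is why a point with a tiny responsibility barely moves the centroid.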


For $\mu_1$:

Numerator:

$$ r_{11} \cdot 1.0 + r_{21} \cdot 2.0 + r_{31} \cdot 5.0 \approx 0.9999938 \cdot 1.0 + 0.9976 \cdot 2.0 + 0.0000062 \cdot 5.0 = 0.9999938 + 1.9952 + 0.000031 \approx 2.9952 $$

Denominator:

$$ r_{11} + r_{21} + r_{31} \approx 0.9999938 + 0.9976 + 0.0000062 = 1.9976 $$

$$ \mu_1^{new} \approx \frac{2.9952}{1.9976} \approx 1.4994 $$


For $\mu_2$:

Numerator:

$$ r_{12} \cdot 1.0 + r_{22} \cdot 2.0 + r_{32} \cdot 5.0 \approx 0.0000062 \cdot 1.0 + 0.0024 \cdot 2.0 + 0.9999938 \cdot 5.0 = 0.0000062 + 0.0048 + 4.999969 \approx 5.0048 $$

Denominator:

$$ r_{12} + r_{22} + r_{32} \approx 0.0000062 + 0.0024 + 0.9999938 = 1.0024 $$

$$ \mu_2^{new} \approx \frac{5.0048}{1.0024} \approx 4.993 $$


🔄 Repeat

Use the new centroids μ₁ ≈ 1.5, μ₂ ≈ 4.993 and repeat Steps 1 and 2 until the centroids stop changing.
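
The repeat-until-stable loop can be sketched as below. This is an illustration under stated assumptions rather than a reference implementation: the function name `soft_kmeans_1d`, the stopping tolerance `tol`, and the iteration cap `max_iter` are choices made for this sketch and do not come from the example above.

```python
import math

def soft_kmeans_1d(points, centroids, beta=1.0, tol=1e-8, max_iter=100):
    """Alternate Steps 1 and 2 until the centroids move by less than tol."""
    centroids = list(centroids)
    for iteration in range(1, max_iter + 1):
        # Step 1: responsibilities r[i][k] from squared distances.
        resp = []
        for x in points:
            weights = [math.exp(-beta * (x - mu) ** 2) for mu in centroids]
            total = sum(weights)
            resp.append([w / total for w in weights])
        # Step 2: each centroid becomes the responsibility-weighted mean.
        updated = []
        for k in range(len(centroids)):
            num = sum(r[k] * x for x, r in zip(points, resp))
            den = sum(r[k] for r in resp)
            updated.append(num / den)
        # Largest centroid movement in this iteration (assumed stopping rule).
        shift = max(abs(a - b) for a, b in zip(updated, centroids))
        centroids = updated
        if shift < tol:
            break
    return centroids, resp, iteration

centroids, resp, n_iter = soft_kmeans_1d([1.0, 2.0, 5.0], [1.5, 4.5])
print(n_iter, [round(c, 4) for c in centroids])
```

On this dataset the centroids settle close to 1.5 and 5.0: x₁ and x₂ share the first cluster almost entirely, while x₃ dominates the second.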


Summary:

  • Soft K-Means doesn’t assign each point to a single cluster; it gives every point a responsibility (soft membership) for each cluster, and these sum to 1 for each point.
  • The responsibilities depend on the squared distance to each centroid and on β.
  • Centroids are updated as weighted averages of the points, using these soft responsibilities as weights.
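
To see how β controls the softness, the sketch below recomputes the responsibility of cluster 1 for x₂ = 2.0 with the initial centroids from Step 0, for several values of β. The function name `responsibility_to_first` and the particular β values are assumptions chosen for illustration.

```python
import math

def responsibility_to_first(x, mu_1, mu_2, beta):
    """Responsibility of cluster 1 for point x, given two centroids."""
    w1 = math.exp(-beta * (x - mu_1) ** 2)
    w2 = math.exp(-beta * (x - mu_2) ** 2)
    return w1 / (w1 + w2)

# x2 = 2.0 with the initial centroids 1.5 and 4.5, for increasing beta.
betas = [0.01, 0.1, 1.0, 10.0]
values = [responsibility_to_first(2.0, 1.5, 4.5, b) for b in betas]
for b, v in zip(betas, values):
    print(f"beta={b}: r_21={v:.4f}")
```

With a small β the point is split almost evenly between the two clusters; as β grows, the responsibility of the nearer cluster approaches 1, which is the hard assignment of standard K-Means.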