ML Exam Prep
- Explain Soft K-Means clustering with the help of an example.
Answer: Let's walk through a step-by-step example of Soft K-Means on a small one-dimensional dataset with concrete numbers, so the mechanics become intuitive.
- Data points: x₁ = 1.0, x₂ = 2.0, x₃ = 5.0
- Number of clusters (K): 2
- Initial centroids: μ₁ = 1.5, μ₂ = 4.5
- Softness parameter: $\beta = 1$ (larger values of $\beta$ give "harder" assignments, closer to standard K-Means)
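Step 1: Compute the responsibilities
For each point $x_i$ and cluster $k$, we use the formula:
$$
r_{ik} = \frac{e^{-\beta (x_i - \mu_k)^2}}{\sum_{j=1}^{K} e^{-\beta (x_i - \mu_j)^2}}
$$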
Let's compute the responsibilities for each point.
For x₁ = 1.0:
- Squared distance to μ₁ = (1.0 − 1.5)² = 0.25
- Squared distance to μ₂ = (1.0 − 4.5)² = 12.25
$$
r_{11} = \frac{e^{-0.25}}{e^{-0.25} + e^{-12.25}} \approx \frac{0.7788}{0.7788 + 4.8 \times 10^{-6}} \approx 0.9999938, \qquad r_{12} = 1 - r_{11} \approx 0.0000062
$$
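For x₂ = 2.0:
- Squared distance to μ₁ = (2.0 − 1.5)² = 0.25
- Squared distance to μ₂ = (2.0 − 4.5)² = 6.25
$$
r_{21} = \frac{e^{-0.25}}{e^{-0.25} + e^{-6.25}} \approx \frac{0.7788}{0.7788 + 0.00193} \approx 0.9975, \qquad r_{22} = 1 - r_{21} \approx 0.0025
$$
For x₃ = 5.0:
- Squared distance to μ₁ = (5.0 − 1.5)² = 12.25
- Squared distance to μ₂ = (5.0 − 4.5)² = 0.25
$$
r_{31} = \frac{e^{-12.25}}{e^{-12.25} + e^{-0.25}} \approx 0.0000062, \qquad r_{32} = 1 - r_{31} \approx 0.9999938
$$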
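The responsibility computation for all three points can be checked with a few lines of Python. This is a minimal sketch using only the standard library; the function name `responsibilities` and its argument names are illustrative choices, not part of the original notes.

```python
import math

def responsibilities(points, centroids, beta=1.0):
    """Return r[i][k], the soft assignment of point i to cluster k."""
    result = []
    for x in points:
        # Unnormalized weight exp(-beta * squared distance) for each centroid.
        weights = [math.exp(-beta * (x - mu) ** 2) for mu in centroids]
        total = sum(weights)
        # Normalize so the responsibilities of one point sum to 1.
        result.append([w / total for w in weights])
    return result

points = [1.0, 2.0, 5.0]
centroids = [1.5, 4.5]
r = responsibilities(points, centroids, beta=1.0)

for x, row in zip(points, r):
    print(f"x = {x}: r = {[round(v, 7) for v in row]}")
```

Each printed row should sum to 1, and the first row should match the hand-computed values for x₁ up to rounding.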
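Step 2: Update the centroids
Each centroid becomes the responsibility-weighted average of all points:
$$
\mu_k = \frac{\sum_{i} r_{ik}\, x_i}{\sum_{i} r_{ik}}
$$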
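For μ₁:
Numerator:
$$
r_{11} \cdot 1.0 + r_{21} \cdot 2.0 + r_{31} \cdot 5.0 \approx 0.9999938 \cdot 1.0 + 0.9975 \cdot 2.0 + 0.0000062 \cdot 5.0 = 0.9999938 + 1.9950 + 0.000031 \approx 2.9950
$$
Denominator:
$$
r_{11} + r_{21} + r_{31} \approx 0.9999938 + 0.9975 + 0.0000062 = 1.9975
$$
Result:
$$
\mu_1 \approx \frac{2.9950}{1.9975} \approx 1.499
$$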
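For μ₂:
Numerator:
$$
r_{12} \cdot 1.0 + r_{22} \cdot 2.0 + r_{32} \cdot 5.0 \approx 0.0000062 \cdot 1.0 + 0.0025 \cdot 2.0 + 0.9999938 \cdot 5.0 = 0.0000062 + 0.0050 + 4.999969 \approx 5.0050
$$
Denominator:
$$
r_{12} + r_{22} + r_{32} \approx 0.0000062 + 0.0025 + 0.9999938 = 1.0025
$$
Result:
$$
\mu_2 \approx \frac{5.0050}{1.0025} \approx 4.993
$$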
Use the new centroids μ₁ ≈ 1.499 and μ₂ ≈ 4.993, and repeat Steps 1 and 2 until the centroids stop changing.
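The two steps and the repeat-until-stable loop can be sketched in plain Python as follows. The function names, the tolerance `tol`, and the iteration cap `max_iter` are assumptions made for illustration; the notes themselves only say to repeat until the centroids stop changing.

```python
import math

def responsibilities(points, centroids, beta):
    """Soft assignments r[i][k] of each point i to each cluster k."""
    resp = []
    for x in points:
        weights = [math.exp(-beta * (x - mu) ** 2) for mu in centroids]
        total = sum(weights)
        resp.append([w / total for w in weights])
    return resp

def update_centroids(points, resp, n_clusters):
    """Responsibility-weighted mean of the points, one value per cluster."""
    new_centroids = []
    for k in range(n_clusters):
        numerator = sum(row[k] * x for row, x in zip(resp, points))
        denominator = sum(row[k] for row in resp)
        new_centroids.append(numerator / denominator)
    return new_centroids

def soft_kmeans(points, centroids, beta=1.0, tol=1e-9, max_iter=100):
    """Alternate both steps until no centroid moves by more than tol."""
    iterations = 0
    for _ in range(max_iter):
        resp = responsibilities(points, centroids, beta)
        new_centroids = update_centroids(points, resp, len(centroids))
        shift = max(abs(a - b) for a, b in zip(new_centroids, centroids))
        centroids = new_centroids
        iterations += 1
        if shift < tol:  # assumed stopping rule: centroids effectively unchanged
            break
    return centroids, iterations

points = [1.0, 2.0, 5.0]
initial = [1.5, 4.5]

# One pass of Step 1 + Step 2, for comparison with the hand computation.
first_update = update_centroids(points, responsibilities(points, initial, 1.0), 2)
# Full loop from the same starting centroids.
final_centroids, n_iterations = soft_kmeans(points, initial, beta=1.0)

print("after one update: ", [round(c, 4) for c in first_update])
print("after convergence:", [round(c, 4) for c in final_centroids],
      "in", n_iterations, "iterations")
```

On this dataset the centroids barely move after the first update, because x₁ and x₂ are assigned almost entirely to cluster 1 and x₃ almost entirely to cluster 2.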
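- Soft K-Means does not assign each point to a single cluster; it assigns each point to every cluster with a probability (its responsibility).
- These probabilities depend on the squared distance to each centroid and on β: a larger β makes the assignments sharper.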
- Centroids are updated based on weighted averages of points using these soft responsibilities.
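To see how β controls the softness, the sketch below evaluates the responsibilities of a single point for several values of β. The test point x = 2.9 and the list of β values are hypothetical choices for illustration only; the centroids are the initial ones from the example.

```python
import math

def point_responsibilities(x, centroids, beta):
    """Responsibilities of one point x toward each centroid."""
    weights = [math.exp(-beta * (x - mu) ** 2) for mu in centroids]
    total = sum(weights)
    return [w / total for w in weights]

centroids = [1.5, 4.5]
x = 2.9  # hypothetical point, slightly closer to the first centroid

results = []
for beta in (0.01, 0.1, 1.0, 10.0):
    r1, r2 = point_responsibilities(x, centroids, beta)
    results.append((beta, r1, r2))
    print(f"beta = {beta:>5}: r1 = {r1:.4f}, r2 = {r2:.4f}")
```

With a small β the point is shared almost equally between the two clusters, while a large β pushes nearly all of the weight onto the closer centroid, which is the behaviour of standard (hard) K-Means.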