1. Definitions
- Bias
  - Error from oversimplifying the model.
  - The model makes strong assumptions and fails to capture the true patterns.
  - Leads to underfitting.
- Variance
  - Error from excessive sensitivity to the training data.
  - The model captures noise and random fluctuations instead of the underlying patterns.
  - Leads to overfitting.
2. Bias-Variance Tradeoff
- In machine learning, we want a balance between the two (a minimal sketch follows this list):
  - High Bias + Low Variance → Underfitting
  - Low Bias + High Variance → Overfitting
  - High Bias + High Variance → Worst of both
  - Low Bias + Low Variance → Best Generalization (ideal case)
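A minimal sketch of the tradeoff, assuming scikit-learn and NumPy are available; the synthetic sine dataset and the particular degrees are illustrative choices, not from the notes. Sweeping polynomial degree shows the underfitting region (both errors high) giving way to the overfitting region (train error low, test error rising):

```python
# Minimal sketch: train/test error across model complexity (polynomial degree).
# Assumes scikit-learn and NumPy; the synthetic sine dataset is illustrative only.
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import PolynomialFeatures
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.3, size=200)  # true curve + noise
X_tr, X_te, y_tr, y_te = X[:150], X[150:], y[:150], y[150:]

for degree in [1, 3, 10, 15]:
    model = make_pipeline(PolynomialFeatures(degree), LinearRegression())
    model.fit(X_tr, y_tr)
    # Low degree: both errors high (high bias).
    # High degree: train error low, test error rising (high variance).
    print(f"degree={degree:2d}  "
          f"train MSE={mean_squared_error(y_tr, model.predict(X_tr)):.3f}  "
          f"test MSE={mean_squared_error(y_te, model.predict(X_te)):.3f}")
```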
3. Graphical Representation
Think of shooting arrows at a target (bullseye):
- 🎯 High Bias, Low Variance → All arrows clustered but far from target center (consistently wrong).
- 🎯 Low Bias, High Variance → Arrows spread widely around the center (right on average, but inconsistent shot to shot).
- 🎯 Low Bias, Low Variance → Arrows tightly clustered at the center (perfect).
- 🎯 High Bias, High Variance → Arrows scattered and far off (worst case).
4. Bias vs Variance Error
| Error Type | Cause | Effect |
|---|---|---|
| Bias Error | Wrong assumptions; model too simple | Misses real patterns (underfits) |
| Variance Error | Model too complex; memorizes noise | Fails to generalize (overfits) |
💡 Total Error = Bias² + Variance + Irreducible Error (the irreducible part is noise inherent in the data that no model can remove); a numerical check of this decomposition follows.
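A minimal NumPy sketch of that check; the sine target, the deliberately too-simple degree-1 model, and the noise level are illustrative assumptions. It refits the model on many fresh training sets and measures bias², variance, and total error at one test point:

```python
# Minimal sketch: checking Total Error ≈ Bias² + Variance + Irreducible Error
# at one test point x0. Pure NumPy; the sin target, degree-1 model, and noise
# level are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(1)
sigma = 0.3                               # noise std -> irreducible error = sigma**2
x0, n_train, n_trials = 2.0, 30, 5000

preds = np.empty(n_trials)
for t in range(n_trials):
    X = rng.uniform(-3, 3, n_train)       # fresh training set each trial
    y = np.sin(X) + rng.normal(0, sigma, n_train)
    coeffs = np.polyfit(X, y, deg=1)      # simple model -> high bias
    preds[t] = np.polyval(coeffs, x0)

bias_sq = (preds.mean() - np.sin(x0)) ** 2
variance = preds.var()
y0 = np.sin(x0) + rng.normal(0, sigma, n_trials)  # fresh noisy targets at x0
total = np.mean((y0 - preds) ** 2)                # empirical expected test error
print(f"bias^2={bias_sq:.3f}  variance={variance:.3f}  noise={sigma**2:.3f}")
print(f"bias^2+variance+noise={bias_sq + variance + sigma**2:.3f}  vs total={total:.3f}")
```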
5. Examples
- High Bias (Underfitting)
  - Linear regression trying to model curved data.
- High Variance (Overfitting)
  - Deep decision tree memorizing training data but failing on new inputs (both cases are shown in the sketch below).
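Both examples in one minimal sketch, assuming scikit-learn; the synthetic quadratic dataset is an illustrative choice. The linear model shows high error on both splits (bias), while the unconstrained tree shows near-zero training error but much higher test error (variance):

```python
# Minimal sketch of both examples: a linear model underfits curved data,
# while an unconstrained decision tree overfits it. Assumes scikit-learn;
# the synthetic quadratic dataset is illustrative only.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(2)
X = rng.uniform(-3, 3, size=(300, 1))
y = X.ravel() ** 2 + rng.normal(0, 1.0, size=300)  # curved relationship + noise
X_tr, X_te, y_tr, y_te = X[:200], X[200:], y[:200], y[200:]

models = [("linear (high bias)", LinearRegression()),
          ("deep tree (high variance)", DecisionTreeRegressor(random_state=0))]
for name, model in models:
    model.fit(X_tr, y_tr)
    # The linear model is bad on both splits; the tree is near-perfect on
    # train but much worse on test.
    print(f"{name:26s} "
          f"train MSE={mean_squared_error(y_tr, model.predict(X_tr)):6.2f}  "
          f"test MSE={mean_squared_error(y_te, model.predict(X_te)):6.2f}")
```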
6. How to Control Bias & Variance
🔹 To reduce Bias (fix underfitting):
- Use a more complex model (e.g., a deeper neural network, or a boosting ensemble).
- Add more relevant features.
- Reduce regularization.
🔹 To reduce Variance (fix overfitting):
- Get more training data.
- Use simpler models.
- Apply regularization (L1, L2, Dropout).
- Use bagging/ensemble methods such as Random Forest (demonstrated in the sketch below).
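A minimal scikit-learn sketch of the bagging tactic from the list above (synthetic data and settings are illustrative only): averaging many deep trees in a Random Forest typically lowers test error relative to a single deep tree, because averaging cancels much of each tree's variance while leaving bias roughly unchanged.

```python
# Minimal sketch: bagging as variance reduction. A Random Forest averages many
# deep trees, which usually cuts test error versus one deep tree. Assumes
# scikit-learn; synthetic data and settings are illustrative only.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(3)
X = rng.uniform(-3, 3, size=(400, 1))
y = np.sin(X).ravel() + rng.normal(0, 0.5, size=400)
X_tr, X_te, y_tr, y_te = X[:300], X[300:], y[:300], y[300:]

single = DecisionTreeRegressor(random_state=0).fit(X_tr, y_tr)
forest = RandomForestRegressor(n_estimators=200, random_state=0).fit(X_tr, y_tr)

print(f"single deep tree test MSE: {mean_squared_error(y_te, single.predict(X_te)):.3f}")
print(f"bagged forest    test MSE: {mean_squared_error(y_te, forest.predict(X_te)):.3f}")
```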
✅ In short:
- Bias = Error from wrong assumptions (model too simple).
- Variance = Error from noise sensitivity (model too complex).
- Goal = Find the right bias-variance balance for good generalization.