1. Definition
Regression is a supervised learning technique used to predict a continuous numerical value based on input features.
- Unlike classification, which predicts discrete categories, regression predicts quantitative outcomes.
👉 It answers: “How much?” or “What value?”
2. Key Characteristics
- Input: Independent variables (features, X).
- Output: Dependent variable (continuous, Y).
- Goal: Learn the relationship between input features and output values (a minimal sketch follows).
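A minimal sketch of this setup using scikit-learn; the feature values and prices below are made-up toy numbers, not real data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data (made-up values): each row of X is one house, with
# features (square footage, bedrooms); y is the continuous price.
X = np.array([[1400, 3], [1600, 3], [1700, 4], [1875, 4], [2350, 5]])
y = np.array([245_000, 312_000, 279_000, 308_000, 419_000])

model = LinearRegression().fit(X, y)  # learn the X -> y relationship
print(model.predict([[2000, 4]]))     # continuous prediction for a new house
```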
3. Types of Regression Models
- Linear Regression
- Models a straight-line relationship between inputs and output.
- Example: Predicting house prices based on square footage.
- Multiple Linear Regression
- Uses more than one independent variable.
- Polynomial Regression
- Fits a curve instead of a straight line.
- Ridge and Lasso Regression (Regularization)
- Prevents overfitting by penalizing large coefficients.
- Logistic Regression (technically a classification method)
- Despite the name, it predicts probabilities of categorical outcomes.
- Non-linear / Advanced Models
- Decision Trees, Random Forests, Gradient Boosting, and Neural Networks can also perform regression (a combined sketch of several models in this list follows).
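A rough sketch of how a few of these variants look in scikit-learn, on synthetic data with illustrative (not tuned) hyperparameters:

```python
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=(100, 1))
y = 0.5 * X[:, 0] ** 2 - 3 * X[:, 0] + rng.normal(0, 2, size=100)  # noisy quadratic

linear = LinearRegression().fit(X, y)               # straight-line fit
poly = make_pipeline(PolynomialFeatures(degree=2),
                     LinearRegression()).fit(X, y)  # fits a curve via squared features
ridge = make_pipeline(PolynomialFeatures(degree=2),
                      Ridge(alpha=1.0)).fit(X, y)   # L2 penalty shrinks coefficients
lasso = make_pipeline(PolynomialFeatures(degree=2),
                      Lasso(alpha=0.1, max_iter=10_000)).fit(X, y)  # L1 can zero terms out

# R² on the training data: the curved fits should beat the straight line here.
print(linear.score(X, y), poly.score(X, y), ridge.score(X, y), lasso.score(X, y))
```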
4. Performance Metrics
- Mean Absolute Error (MAE) → Average of the absolute errors.
- Mean Squared Error (MSE) → Average of the squared errors (penalizes big errors more heavily).
- Root Mean Squared Error (RMSE) → Square root of MSE, in the same units as the target.
- R² (Coefficient of Determination) → Proportion of the target's variance explained by the model (a short sketch computing all four follows).
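A small sketch computing these metrics with scikit-learn; y_true and y_pred are made-up values for illustration:

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score

y_true = np.array([3.0, -0.5, 2.0, 7.0])  # actual target values (toy data)
y_pred = np.array([2.5,  0.0, 2.0, 8.0])  # model predictions (toy data)

mae = mean_absolute_error(y_true, y_pred)  # mean of |error|
mse = mean_squared_error(y_true, y_pred)   # mean of error², punishes large misses
rmse = np.sqrt(mse)                        # back in the target's units
r2 = r2_score(y_true, y_pred)              # 1.0 = perfect; 0.0 = no better than the mean

print(f"MAE={mae:.3f}  MSE={mse:.3f}  RMSE={rmse:.3f}  R²={r2:.3f}")
```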
5. Applications
- 🏠 House Price Prediction (real estate).
- 📈 Stock Market Forecasting.
- 🚗 Car Mileage Prediction (MPG).
- 👩‍⚕️ Disease Progression Estimation (blood sugar, tumor growth, etc.).
- 📊 Sales Forecasting.
- 🌦 Weather Prediction (temperature, rainfall).
6. Challenges
⚠️ Outliers can heavily impact predictions.
⚠️ Complex models can overfit the training data.
⚠️ Built-in assumptions may not hold (e.g., linear regression assumes a linear relationship between features and target).
⚠️ Requires sufficient, good-quality data.
✅ In short: Regression = Predicting continuous values using patterns learned from labeled data.