This project predicts tuition costs for 600+ U.S. colleges using engineered institutional and geographic features.
Highlights
- Dataset: 600+ U.S. college records
- Models: Ridge Regression, Gradient Boosting Regressor, Neural Network (PyTorch)
- Evaluation: 7-fold cross-validation
- Team rank: 2nd out of 13 teams
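
The evaluation setup listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: it assumes scikit-learn, uses synthetic data from `make_regression` as a stand-in for the college dataset, and the hyperparameters (`alpha=1.0`, default boosting settings, `random_state=0`) are placeholders rather than the values the team used.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the ~600-row college dataset (NOT the real data).
X, y = make_regression(n_samples=600, n_features=20, noise=10.0, random_state=0)

# 7 folds, shuffled so each fold mixes rows from the whole dataset.
cv = KFold(n_splits=7, shuffle=True, random_state=0)

# Placeholder hyperparameters; the project's actual settings are not stated.
models = {
    "ridge": Ridge(alpha=1.0),
    "gbr": GradientBoostingRegressor(random_state=0),
}

cv_rmse = {}
for name, model in models.items():
    # scikit-learn reports negated RMSE (higher is better), so flip the sign.
    scores = -cross_val_score(
        model, X, y, cv=cv, scoring="neg_root_mean_squared_error"
    )
    cv_rmse[name] = scores
    print(f"{name}: mean RMSE = {scores.mean():.2f} (std {scores.std():.2f})")
```

Reporting the per-fold spread alongside the mean helps show whether a gap between two models is larger than the fold-to-fold noise.
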
Performance
- Best RMSE: 5,860.88
- R² score: 0.799
- Baseline RMSE: 13,086.66
- Improvement over baseline: 55% reduction in RMSE
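
The improvement figure follows directly from the two RMSE values reported above. A short check of the arithmetic, with variable names chosen here for illustration:

```python
best_rmse = 5860.88
baseline_rmse = 13086.66

# Absolute and relative reduction in RMSE versus the baseline.
reduction = baseline_rmse - best_rmse
improvement_pct = 100 * reduction / baseline_rmse

print(f"RMSE reduction: {reduction:,.2f}")                    # 7,225.78
print(f"Improvement over baseline: {improvement_pct:.1f}%")   # 55.2%
```
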
Learnings
- Engineered features (region and institution type) were high-impact predictors.
- Ridge regression's L2 penalty helped regularize high-dimensional feature sets.
- Gradient boosting outperformed linear baselines by capturing nonlinear tuition patterns.
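
The points above can be illustrated with a small synthetic experiment. Everything in it is an assumption made for demonstration: the category values, the `enrollment` feature, the tuition formula, and the hyperparameters are invented and do not come from the project's data. The sketch one-hot encodes region and institution type, then compares Ridge with gradient boosting on a target that contains an interaction (a region-specific premium for private institutions) that an additive linear model cannot represent.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder

rng = np.random.default_rng(0)
n = 600

# Hypothetical features; values are invented for illustration only.
region = rng.choice(["Northeast", "South", "Midwest", "West"], size=n)
inst_type = rng.choice(["public", "private"], size=n)
enrollment = rng.uniform(1_000, 40_000, size=n)

# Synthetic target: the private premium is larger in one region, which is
# an interaction that one-hot main effects alone cannot express.
base = np.where(inst_type == "private", 30_000.0, 10_000.0)
premium = np.where(
    (inst_type == "private") & (region == "Northeast"), 15_000.0, 0.0
)
tuition = base + premium - 0.1 * enrollment + rng.normal(0, 1_000, size=n)

# One-hot encode the two categorical columns, then append the numeric one.
# (Encoding before the split is harmless here: categories are fixed and known.)
categorical = np.column_stack([region, inst_type])
onehot = OneHotEncoder().fit_transform(categorical).toarray()
X = np.hstack([onehot, enrollment.reshape(-1, 1)])

X_train, X_test, y_train, y_test = train_test_split(
    X, tuition, test_size=0.25, random_state=0
)

def rmse(model):
    """Fit on the training split and return RMSE on the held-out split."""
    model.fit(X_train, y_train)
    return float(np.sqrt(mean_squared_error(y_test, model.predict(X_test))))

rmse_ridge = rmse(Ridge(alpha=1.0))
rmse_gbr = rmse(GradientBoostingRegressor(random_state=0))

print(f"Ridge RMSE:             {rmse_ridge:,.0f}")
print(f"Gradient boosting RMSE: {rmse_gbr:,.0f}")
```

On this synthetic target the boosted trees should reach a lower held-out RMSE because they can split on region and institution type jointly. On real data the gap depends on how much nonlinear structure is actually present.
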