Boston Housing Price Prediction
Summary
I built a complete end-to-end machine learning pipeline in Python using scikit-learn to predict Boston housing prices. This project demonstrates my ability to perform data cleaning, feature engineering, multivariable regression, diagnostic evaluation, and clear result interpretation, skills I also apply to biological datasets. Visualizations were created with Matplotlib and Seaborn for clear insights.
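A minimal sketch of the kind of workflow described above, not the project's actual code: it splits the data, fits a linear regression with scikit-learn, and reports R² and RMSE on the held-out set. The synthetic data, the coefficients, and the variable names are assumptions made for illustration; the Boston dataset itself is not loaded here.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (assumption): three features with a linear
# signal plus Gaussian noise, used in place of the real housing data.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3))
y = X @ np.array([3.0, -2.0, 1.0]) + rng.normal(scale=0.5, size=500)

# Hold out 20% of the rows for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Scale the features, then fit an ordinary least squares model.
model = make_pipeline(StandardScaler(), LinearRegression())
model.fit(X_train, y_train)

# Evaluate on the held-out rows.
y_pred = model.predict(X_test)
r2 = r2_score(y_test, y_pred)
rmse = float(np.sqrt(mean_squared_error(y_test, y_pred)))
residuals = y_test - y_pred  # input for a residual plot

print(f"R^2 = {r2:.3f}, RMSE = {rmse:.3f}")
```

The `residuals` array is what a residual plot would draw against `y_pred`; a scatter with no visible pattern around zero suggests the linear fit is adequate.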
Project Highlights
- Developed a multivariable linear regression model achieving R² > 0.80 on test data.
- Performed correlation analysis and addressed multicollinearity using VIF.
- Validated model performance with RMSE, residual plots, and diagnostic graphs.
- Structured the pipeline end-to-end: data preprocessing → feature selection → model training → evaluation → visualization.
- Used Matplotlib and Seaborn to visualize residuals, confidence intervals, and prediction trends.
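The multicollinearity check mentioned in the highlights can be illustrated with a small variance inflation factor (VIF) computation. This is a sketch under stated assumptions, not the project's code: the helper `vif` and the synthetic columns `a`, `b`, and `c` are hypothetical, and NumPy's least squares solver stands in for whatever library the project used. For each feature, the function regresses it on the remaining features and returns 1 / (1 - R²).

```python
import numpy as np

def vif(X):
    """Return the variance inflation factor of each column of X."""
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    factors = []
    for j in range(p):
        target = X[:, j]
        others = np.delete(X, j, axis=1)
        # Regress column j on the remaining columns plus an intercept.
        design = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((target - target.mean()) ** 2).sum())
        r2 = 1.0 - ss_res / ss_tot
        # A perfectly explained column has an unbounded VIF.
        factors.append(np.inf if r2 >= 1.0 else 1.0 / (1.0 - r2))
    return np.array(factors)

# Hypothetical data: c is nearly a copy of a, b is independent of both.
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + rng.normal(scale=0.1, size=200)

vifs = vif(np.column_stack([a, b, c]))
print(dict(zip(["a", "b", "c"], np.round(vifs, 2))))
```

In this example the near-duplicate columns `a` and `c` receive large VIF values while `b` stays close to 1. A common rule of thumb treats values above roughly 5 to 10 as a sign that a feature should be dropped or combined with others, though the appropriate cutoff depends on the analysis.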