Car Price Prediction Using Bayesian GLM
Advanced Bayesian Data Analysis Project | TU Dortmund | 2024
Quick Summary
We built and compared three Bayesian generalized linear models (GLMs) to predict car prices from a dataset of 205 cars. The predictors were brand indicators (BMW, Audi, Toyota), technical specifications, and fuel type. All three models used informative priors and were compared using Bayes Factors and WAIC.
Tools & Techniques
Language: R (v4.2.2)
Libraries: brms, rstan, loo, ggplot2, dplyr
Model Type: Bayesian GLM
Evaluation: Bayes Factor, WAIC, Posterior Predictive Checks
Models Developed
We built three models, each centered on a different car brand: BMW, Audi, and Toyota.
Model 1 (BMW)
Best-performing model.
Key predictors: BMW indicator, engine size, car width, curb weight, horsepower.
Bayes Factor and WAIC both favored it over the other two models.
Model 2 (Audi)
Moderate performance.
Engine size, horsepower, and peak RPM had strong positive effects.
Model 3 (Toyota)
Focused on the Toyota brand.
Slightly lower predictive performance than the BMW model.
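As a rough illustration of how one of these brand-specific models could be set up with brms, here is a minimal sketch. It is not the project's actual code: the column names (price, bmw, enginesize, carwidth, curbweight, horsepower), the Gaussian family, the normal(0, 1) slope prior, and the sampler settings are all assumptions, and the helper names brand_formula and fit_brand_model are hypothetical.

```r
# Hypothetical column names; the write-up does not list the actual ones.
brand_formula <- function(brand_col,
                          specs = c("enginesize", "carwidth",
                                    "curbweight", "horsepower")) {
  # Build price ~ <brand indicator> + <technical specs> as a plain R formula.
  stats::reformulate(c(brand_col, specs), response = "price")
}

fit_brand_model <- function(data, brand_col, ...) {
  # Needs the brms package and a working Stan toolchain when called.
  brms::brm(
    formula   = brand_formula(brand_col),
    data      = data,
    family    = stats::gaussian(),  # assumed likelihood
    # Placeholder slope prior (assumes standardized predictors);
    # not the informative priors used in the project.
    prior     = brms::set_prior("normal(0, 1)", class = "b"),
    # Keep all parameters so Bayes factors via bridge sampling are possible.
    save_pars = brms::save_pars(all = TRUE),
    chains = 4, iter = 2000, seed = 2024,
    ...
  )
}

# Formula for the BMW-centered model.
print(brand_formula("bmw"))
```

With a prepared data frame (here called cars_std as a placeholder), a call such as fit_brand_model(cars_std, "bmw") would return a fitted model that can be passed to brms::pp_check(), brms::waic(), or brms::bayes_factor(). Bayes factors from bridge sampling usually need considerably more posterior draws than the default-sized run shown here.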
Model Evaluation Highlights
Model 1 (BMW) had:
Lowest WAIC: 196.4
Strong Bayes Factor advantage over others
Posterior predictive checks confirmed the robustness of the selected model
Residual plots were examined as an additional diagnostic
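To make the WAIC comparison concrete, the sketch below computes WAIC in base R from a matrix of pointwise log-likelihoods (rows are posterior draws, columns are observations), using the same definitions as the loo package. The function name waic_from_loglik and the simulated data are illustrative assumptions and are unrelated to the car dataset. Lower WAIC indicates better expected out-of-sample fit.

```r
# ll: pointwise log-likelihood matrix, rows = posterior draws, cols = observations.
waic_from_loglik <- function(ll) {
  stopifnot(is.matrix(ll), nrow(ll) >= 2)
  # Log pointwise predictive density: log of the mean likelihood per observation,
  # shifted by the column maximum for numerical stability.
  col_max <- apply(ll, 2, max)
  lppd_i <- col_max + log(colMeans(exp(sweep(ll, 2, col_max))))
  # Effective number of parameters: posterior variance of the
  # log-likelihood for each observation.
  p_waic_i <- apply(ll, 2, var)
  elpd_i <- lppd_i - p_waic_i
  c(elpd_waic = sum(elpd_i), p_waic = sum(p_waic_i), waic = -2 * sum(elpd_i))
}

# Simulated example: normal data with known sd = 1 and approximate
# posterior draws for the mean (flat prior).
set.seed(1)
y <- rnorm(20)
mu_draws <- rnorm(1000, mean = mean(y), sd = 1 / sqrt(length(y)))
ll_demo <- sapply(y, function(yi) dnorm(yi, mean = mu_draws, sd = 1, log = TRUE))
print(waic_from_loglik(ll_demo))
```

For a fitted brms model, the same kind of matrix is what log_lik() returns, and loo::waic() performs this calculation along with standard errors.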
Key Takeaways
Data preprocessing (one-hot encoding, standardization) was crucial
Bayesian priors bring interpretability and regularization
Engine size and fuel type consistently influenced prices
Validating the models with both WAIC and Bayes Factor supported a fair model selection
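The preprocessing steps mentioned above (one-hot encoding and standardization) can be sketched in base R as follows. The small data frame is a made-up stand-in, and its column names and values are assumptions for illustration only, not rows from the 205-car dataset.

```r
# Toy stand-in for the car data; names and values are invented for illustration.
cars <- data.frame(
  brand      = c("bmw", "audi", "toyota", "bmw", "toyota"),
  fueltype   = c("gas", "gas", "diesel", "diesel", "gas"),
  enginesize = c(164, 136, 110, 209, 98),
  horsepower = c(121, 110, 73, 182, 70),
  price      = c(20970, 17450, 7898, 30760, 6488),
  stringsAsFactors = FALSE
)

# One-hot style brand indicators (0/1), one column per brand of interest.
for (b in c("bmw", "audi", "toyota")) {
  cars[[b]] <- as.integer(cars$brand == b)
}

# Binary indicator for fuel type.
cars$diesel <- as.integer(cars$fueltype == "diesel")

# Standardize numeric predictors to mean 0 and standard deviation 1,
# which keeps coefficient priors on a comparable scale.
num_cols <- c("enginesize", "horsepower")
cars[num_cols] <- lapply(cars[num_cols], function(x) as.numeric(scale(x)))

print(cars)
```

Standardizing before fitting is also what makes a single prior scale on the regression coefficients meaningful across predictors measured in different units.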
Reflection
This project taught us not only how to model but how to think in terms of uncertainty. It deepened our understanding of Bayesian inference and how model diagnostics and prior choices shape trustworthy conclusions.