CTEC 426 — Data-Driven Used Car Pricing Analysis
This project develops a data-driven approach to assessing fairness and pricing accuracy in the used car market. Using Austin Reese’s national Craigslist dataset of over 426,000 used car listings, it analyzes how age, mileage, condition, usage intensity, and vehicle type influence real-world vehicle pricing.
I performed extensive data cleaning, feature engineering, and market segmentation to create a refined dataset and extract actionable insights. New variables such as vehicle age, miles per year, and Edmunds-based price tiers were engineered to improve interpretability and help identify fair pricing ranges, potential overpricing, and market behavior among different manufacturers and vehicle categories.
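The engineered variables described above can be sketched roughly as follows. This is a minimal illustration in plain Python, not the project's actual code: the field names (`year`, `odometer`, `price`), the `reference_year` default, and the tier boundaries are all assumptions made for the example. In particular, the dollar thresholds are placeholders and are not Edmunds figures.

```python
# Hypothetical tier boundaries in USD (upper bound, label).
# Placeholders for illustration only; not taken from Edmunds or the project.
PRICE_TIERS = [(5_000, "budget"), (15_000, "economy"), (30_000, "mid-range")]
TOP_TIER = "premium"


def vehicle_age(model_year: int, reference_year: int) -> int:
    """Age in years, clamped at 0 so next-model-year listings are not negative."""
    return max(reference_year - model_year, 0)


def miles_per_year(odometer: float, age: int) -> float:
    """Usage intensity. Vehicles under one year old count as one year
    to avoid dividing by zero."""
    return odometer / max(age, 1)


def price_tier(price: float) -> str:
    """Return the label of the first tier whose upper bound exceeds the price."""
    for upper, label in PRICE_TIERS:
        if price < upper:
            return label
    return TOP_TIER


def add_features(listing: dict, reference_year: int = 2021) -> dict:
    """Return a copy of the listing with the three engineered fields added.
    reference_year is an assumed snapshot year, not a value from the project."""
    age = vehicle_age(listing["year"], reference_year)
    return {
        **listing,
        "age": age,
        "miles_per_year": miles_per_year(listing["odometer"], age),
        "price_tier": price_tier(listing["price"]),
    }


# A made-up listing, not a row from the dataset.
example = add_features({"year": 2015, "odometer": 72_000, "price": 13_500})
print(example)
```

Keeping each feature in its own small function makes the choices explicit (how to treat a zero-year-old vehicle, where tier boundaries sit), so they are easy to revisit when the boundaries are tuned.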
What I demonstrated:
- Large-scale dataset cleaning and refinement (over 426k entries)
- Feature engineering (Age, Miles-Per-Year, Price Tiers)
- Outlier detection and realistic boundary filtering
- Correlation analysis of major pricing factors
- Market segmentation across vehicle types and conditions
- Identification of fair-value trends and pricing anomalies
- Proposal for a future FairPrice automated valuation tool
Project repository link: View Full Project on GitHub
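As a rough illustration of the boundary filtering and correlation steps listed above, the sketch below drops implausible listings and then computes Pearson correlations against price. It is written under stated assumptions: the bounds in `BOUNDS` are hypothetical (the actual cutoffs used in the project are not given here), the rows are made-up toy data rather than dataset results, and the helper names are invented for this example.

```python
from math import sqrt

# Hypothetical plausibility bounds (min, max) per field; assumptions only.
BOUNDS = {"price": (500, 100_000), "odometer": (0, 300_000), "age": (0, 30)}


def within_bounds(row: dict) -> bool:
    """True when every bounded field falls inside its plausible range."""
    return all(lo <= row[col] <= hi for col, (lo, hi) in BOUNDS.items())


def pearson(xs: list, ys: list) -> float:
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


# Toy rows invented for the example, including two typical bad entries.
rows = [
    {"price": 22_000, "odometer": 30_000, "age": 2},
    {"price": 15_000, "odometer": 60_000, "age": 5},
    {"price": 9_000, "odometer": 110_000, "age": 9},
    {"price": 4_500, "odometer": 160_000, "age": 14},
    {"price": 1, "odometer": 90_000, "age": 7},          # placeholder price
    {"price": 12_000, "odometer": 9_999_999, "age": 6},  # odometer typo
]

clean = [row for row in rows if within_bounds(row)]
prices = [row["price"] for row in clean]

# Correlation of each candidate pricing factor with price.
correlations = {
    col: pearson([row[col] for row in clean], prices)
    for col in ("odometer", "age")
}
print(len(clean), correlations)
```

Filtering before correlating matters because a single extreme entry, such as a mistyped odometer reading, can dominate a Pearson coefficient. The same logic maps directly onto column operations in a dataframe library if one is used.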