Loan Default Prediction & Return Optimization
Decision-aware credit risk framework for consumer lending — moving beyond binary default prediction toward return-optimized portfolio strategy using ensemble and neural net models on LendingClub data.
I build ML systems and analytical pipelines that turn complex data into decisions, from risk scoring models in FinTech to NLP research and forecasting at Carnegie Mellon. 3+ years across ML engineering, data science, and financial analytics.
Built production ML systems across 10M+ monthly merchant transactions: an XGBoost risk scoring model (AUC 0.82), RAG-powered NLP pipelines for signal extraction, and time-series forecasting baselines that cut escalations by 28%. Owned the end-to-end ML workflow in Databricks and Snowflake across 30K+ accounts.
Led data analytics and ML engineering for Coca-Cola Bottling's retail network. Delivered $1.2M in incremental revenue via a cohort-based recommendation engine across 30K+ locations, and reduced stockout losses by 9% across 50K+ stores via demand forecasting. Led the migration to Azure Databricks with Airflow and MLflow.
Built regression models over 100K+ behavioral records to forecast crypto adoption, improving 3-month forecast accuracy by roughly 20%.
TA for Applied Econometrics — causal inference and hypothesis testing in R.
Built a currency exchange analytics platform using live financial market data — ARIMA time-series models to forecast return patterns and optimal conversion timing, delivered through an interactive interface.
Production ML pipeline with drift monitoring and automated retraining — the same architecture used for capacity planning and staffing models in operations-heavy industries. Sub-5% forecast error sustained under changing conditions.
Lightweight CV system for autonomous driving contexts. Optimized inference and cloud-edge integration for edge device deployment.
AI literacy platform for children — gamified reading with LLMs and RAG delivering safe, age-appropriate content dynamically.
Speculative decoding, pruning, and quantization for LLaMA2 and GPT-2 in PyTorch — benchmarking accuracy–performance tradeoffs for CPU and GPU deployment.