Loan Default Prediction & Return Optimization
Decision-aware credit risk framework for consumer lending — moving beyond binary default prediction toward return-optimized portfolio strategy using ensemble and neural net models on LendingClub data.
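To make the "decision-aware" idea concrete, here is a minimal sketch of return-optimized approval: rather than rejecting every applicant above a default-probability cutoff, approve when the model-implied expected return is positive. The interest rate, loss-given-default, and hurdle values below are illustrative assumptions, not figures from the project.

```python
# Sketch of decision-aware loan approval: instead of thresholding the
# predicted default probability, approve when expected return is positive.
# interest_rate, lgd, and hurdle are illustrative assumptions.

def expected_return(p_default: float, interest_rate: float, lgd: float = 0.9) -> float:
    """Expected return per unit of principal.

    p_default: model-estimated probability of default
    interest_rate: rate earned if the loan is repaid
    lgd: loss given default (fraction of principal lost)
    """
    return (1 - p_default) * interest_rate - p_default * lgd

def approve(p_default: float, interest_rate: float, hurdle: float = 0.0) -> bool:
    """Approve when expected return clears a hurdle rate."""
    return expected_return(p_default, interest_rate) > hurdle

# A risky loan at a high rate can still clear the hurdle:
# 0.85 * 0.25 - 0.15 * 0.9 = 0.0775 > 0
print(approve(0.15, 0.25))  # True, though a p < 0.10 cutoff would reject it
```

The point of the example: a binary classifier with a fixed probability cutoff would decline the 15%-risk loan above, while the return view shows it is priced well enough to hold.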
I build the metric foundations, retention models, and experiment frameworks that NYC's fintech and SaaS teams use to make decisions. Carnegie Mellon MISM graduate with Distinction; 3+ years across product analytics, ML engineering, and BI.
Core stack: SQL, Python, Tableau, and XGBoost on a B2B SaaS fintech platform serving 30K+ merchants. Built a SQL data foundation over 10M+ monthly events, standardizing retention KPIs. Developed an XGBoost retention model (AUC 0.82) that improved quarterly retention by 11%. Built time-series baselines that reduced intervention escalations by 28%.
Led data analytics and ML engineering for Coca-Cola Bottling's retail network — $1.2M in incremental revenue via a cohort-based recommendation engine across 30K+ locations. Reduced stockout losses by 9% across 50K+ stores via demand forecasting. Led the migration to Azure Databricks with Airflow + MLflow.
Built regression models over 100K+ behavioral records to forecast crypto adoption, improving 3-month accuracy by ~20%. TA for Applied Econometrics — causal inference and hypothesis testing in R.
SQL + statistical modeling over demographic datasets to surface voter turnout patterns. Interactive Tableau dashboards for non-technical stakeholders.
Production ML pipeline with drift monitoring and automated retraining — the same architecture used for capacity planning and staffing models in operations-heavy industries. Sub-5% forecast error sustained under changing conditions.
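One common drift trigger for automated retraining is the Population Stability Index (PSI), which compares the binned distribution of a feature between a reference window and a live window. The sketch below is a minimal stdlib-only version; the bin count and the 0.2 alert threshold are conventional rules of thumb, assumed here for illustration rather than taken from the project.

```python
# Minimal PSI-based drift check, a common retraining trigger.
# Bin count and 0.2 threshold are illustrative assumptions.
import math

def psi(expected: list[float], actual: list[float], bins: int = 10) -> float:
    """Population Stability Index between a reference sample (expected)
    and a live sample (actual), over a shared equal-width binning."""
    lo = min(min(expected), min(actual))
    hi = max(max(expected), max(actual))
    width = (hi - lo) / bins or 1.0

    def frac(xs: list[float]) -> list[float]:
        counts = [0] * bins
        for x in xs:
            i = min(int((x - lo) / width), bins - 1)
            counts[i] += 1
        # small epsilon avoids log(0) on empty bins
        return [(c + 1e-6) / (len(xs) + 1e-6 * bins) for c in counts]

    e, a = frac(expected), frac(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

ref = [i / 100 for i in range(100)]        # reference distribution
live_same = ref[:]                         # no drift
live_shifted = [x + 0.5 for x in ref]      # distribution shift
print(psi(ref, live_same) < 0.1)           # True: stable, no action
print(psi(ref, live_shifted) > 0.2)        # True: would trigger retraining
```

In a production pipeline this check would run per feature on each scoring batch, with a breach kicking off the automated retraining job.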
Lightweight computer-vision system for autonomous-driving contexts, with inference and cloud-edge integration optimized for deployment on edge devices.
AI literacy platform for children — gamified reading with LLMs and RAG delivering safe, age-appropriate content dynamically.
Speculative decoding, pruning, and quantization for LLaMA2 and GPT-2 in PyTorch — benchmarking accuracy–performance tradeoffs for CPU and GPU deployment.
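For flavor, here is a framework-free sketch of the core of symmetric int8 post-training quantization, one of the techniques named above. The project applied this to LLaMA2/GPT-2 weight tensors in PyTorch; this toy version uses a plain Python list so the mechanics are visible.

```python
# Toy symmetric int8 post-training quantization: one scale per tensor,
# values mapped to [-127, 127]. Illustrative only; the real work used
# PyTorch tensors and per-layer calibration.

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Map floats to int8 with a single per-tensor scale."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    return [qi * scale for qi in q]

w = [0.31, -1.27, 0.04, 0.9]
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
max_err = max(abs(a - b) for a, b in zip(w, w_hat))
print(max_err <= s / 2 + 1e-12)  # round-to-nearest error is at most half a step
```

The accuracy-performance tradeoff being benchmarked is exactly this: the half-step reconstruction error bought in exchange for 4x smaller weights and faster integer kernels.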