Brooklyn, NY & Dallas, TX

Sushanth
Data Science & Analytics

I build ML systems and analytical pipelines that turn complex data into decisions, from risk scoring models in FinTech to NLP research and forecasting at Carnegie Mellon. 3+ years across ML engineering, data science, and financial analytics.

Python · SQL · dbt · Time Series Forecasting · Databricks · Machine Learning
$1.2M
Revenue impact
10M+
Events / month
30k+
Accounts modeled
Certified
AWS Cloud Practitioner
Databricks Data Analyst Associate
AWS AI Practitioner
Career

Experience

Bluebird Technologies
Apr 2025 – Present · New York City, NY
Data Analyst — Applied ML

Built production ML systems across 10M+ monthly merchant transactions: an XGBoost risk scoring model (AUC 0.82), RAG-powered NLP pipelines for signal extraction, and time-series forecasting baselines that cut escalations by 28%. Owned the end-to-end ML workflow in Databricks and Snowflake across 30K+ accounts.

SQL · Python · XGBoost · NLP · RAG · Databricks · Snowflake · Time-Series
Quantiphi Inc.
Feb 2021 – Jul 2023 · Mumbai
Machine Learning Engineer

Led data analytics and ML engineering for Coca-Cola Bottling's retail network, delivering $1.2M in incremental revenue via a cohort-based recommendation engine across 30K+ locations. Reduced stockout losses by 9% across 50K+ stores via demand forecasting. Led the migration to Azure Databricks with Airflow and MLflow.

PySpark · Databricks · Airflow · MLflow
Carnegie Mellon University
2024 · Pittsburgh, PA
Graduate Research Analyst

Built regression models over 100K+ behavioral records to forecast crypto adoption, improving 3-month forecast accuracy by ~20%.

Python · SQL · Forecasting · Tableau
Carnegie Mellon University
2024 · Pittsburgh, PA
Graduate Teaching Assistant

TA for Applied Econometrics, covering causal inference and hypothesis testing in R.

R · Causal Inference · Econometrics
Work

Projects

Credit Risk Analytics · FinTech

Loan Default Prediction & Return Optimization

Decision-aware credit risk framework for consumer lending — moving beyond binary default prediction toward return-optimized portfolio strategy using ensemble and neural net models on LendingClub data.

Python · XGBoost · Risk Analytics
AUC 0.91
Profit ↑18%
GitHub →
Financial Analytics · CMU

Currency Exchange Analytics & Forecasting

Built a currency exchange analytics platform using live financial market data — ARIMA time-series models to forecast return patterns and optimal conversion timing, delivered through an interactive interface.

Python · ARIMA · Time-Series · Financial Data
Models: ARIMA
GitHub →
Operational Analytics · MLOps

Real-Time Demand Forecasting Pipeline

Production ML pipeline with drift monitoring and automated retraining — the same architecture used for capacity planning and staffing models in operations-heavy industries. Sub-5% forecast error sustained under changing conditions.

Python · MLflow · Docker
MAPE <5%
GitHub →
Computer Vision · Autonomous Systems

Driver Safety — Hazard Detection

Lightweight CV system for autonomous driving contexts. Optimized inference and cloud-edge integration for edge device deployment.

PyTorch · CV · Edge Deployment
Accuracy ↑18%
Latency ↓22%
GitHub →
EdTech · Generative AI — In Progress

Bookwormy

AI literacy platform for children — gamified reading with LLMs and RAG delivering safe, age-appropriate content dynamically.

RAG · LLMs · EdTech
Status: Building
AI Research · CMU

LLM Inference Optimization

Speculative decoding, pruning, and quantization for LLaMA2 and GPT-2 in PyTorch — benchmarking accuracy–performance tradeoffs for CPU and GPU deployment.

PyTorch · LLMs · Quantization
Speedup 1.2×
Retention 94%
GitHub →
Outside Work

Creative

Art
Happy
Acrylic · 2024
Friend
Acrylic · 2024
Light
Digital · 2025
Writing
Manipal The Talk Network
Sub-Head of Writing
Read
Cultural Fest '19 · MIT Manipal
Head of Content
Read
The Editorial Board · Manipal
Writer
Read
Get in touch

Contact