Senior Data Science Engineer
Job description
We are seeking a skilled Data Science Engineer to design, build, and deploy production machine learning solutions for an enterprise Fleet Cascading & Optimization Platform managing 46,000+ vehicles across 545+ locations. In this role, you will develop and operationalize demand forecasting, cascading optimization, contract intelligence (NLP/Vision), and out-of-spec prediction models with a strong focus on explainability and business impact. You will own the end-to-end ML lifecycle, from experimentation and model development through scalable production deployment on AWS, and work closely with engineering and business stakeholders to deliver reliable, data-driven outcomes.
Must-Have Requirements
- Programming & ML Frameworks: Python; PyTorch or TensorFlow; scikit-learn; XGBoost or LightGBM; pandas; NumPy
- Time Series & Forecasting: BSTS; Prophet; Temporal Fusion Transformer (TFT); hierarchical forecasting with MinT reconciliation
- Optimization: Linear Programming and MILP using tools such as PuLP and OR-Tools; constraint satisfaction; min-cost flow optimization
- AWS ML Stack: Amazon SageMaker (Training Jobs, Endpoints, Model Monitor, Clarify, Feature Store, Pipelines)
Nice-to-Have
- NLP & Document AI: Amazon Textract; LayoutLMv3; Retrieval-Augmented Generation (RAG) pipelines; Amazon Bedrock (Claude); OpenSearch vector databases
- Advanced Machine Learning: Graph Neural Networks (GNNs); Deep Reinforcement Learning; Survival Analysis (Cox Proportional Hazards, XGBoost-Survival); attention-based models
- Explainability & MLOps: SHAP, LIME, Captum; MLflow; A/B testing; champion/challenger frameworks; model and data drift detection
Core Responsibilities
- Build demand forecasting models (XGBoost, BSTS, Temporal Fusion Transformer) with hierarchical reconciliation across 545+ locations
- Develop cascading optimization using MILP and min-cost flow solvers (PuLP, OR-Tools, Gurobi) and hybrid ML + optimization pipelines
- Implement a document intelligence pipeline: Textract + LayoutLMv3 for document extraction, and RAG with Bedrock (Claude) for semantic reasoning
- Deploy models on SageMaker with MLOps tooling (Model Monitor, Feature Store, Pipelines); implement SHAP/LIME explainability
Models You'll Build
- Demand Forecasting: Gradient-boosted models (XGBoost), Bayesian Structural Time Series (BSTS), and Temporal Fusion Transformers (TFT), including hierarchical reconciliation
- Cascading Optimization: Mixed-Integer Linear Programming (MILP) and Min-Cost Flow models, evolving to hybrid ML + solver approaches and advanced Graph Neural Network (GNN) and Deep Reinforcement Learning (DRL) solutions
- Document Intelligence: Automated document extraction using Amazon Textract and LayoutLMv3, advancing to Retrieval-Augmented Generation (RAG) pipelines with Amazon Bedrock and Vision-Language Models
- Survival & Out-of-Spec Prediction: Kaplan-Meier estimators, Cox Proportional Hazards models, and XGBoost-Survival techniques
What we offer
- Continuous learning and career growth opportunities
- Professional training and English/Spanish language classes
- Comprehensive medical insurance
- Mental health support
- Specialized benefits program with compensation for fitness activities, hobbies, pet care, and more
- Flexible working hours
- Inclusive and supportive culture