AI Engineer
Job description
We are hiring an AI Engineer to be the lead technical contributor on a personalization and ranking engagement for a large-scale consumer marketplace. You will set the technical direction, make the key modeling decisions, and stay hands-on throughout. You will be a senior technical point of contact with the customer - explaining trade-offs, managing expectations, and turning results into clear recommendations. You will lead a rigorous, POC-first program: engineering user-level features from behavioral data, integrating LLM-generated user profiles into a deep-learning ranking model, and driving the work from offline validation through production-readiness.
What You'll Do:
-
Own the technical strategy for a personalization program on a production recommendation/ranking system, making the architecture and modeling decisions and being accountable for the results.
-
Stay hands-on: build the features, train the models, run the experiments, and write the critical code.
-
Set the technical bar and support other engineers through design reviews, mentorship, and pairing.
-
Act as a senior technical point of contact with the customer, communicating progress, risks, and results to both engineers and senior stakeholders, and managing expectations through ambiguity.
-
Design and run a structured, parallel-track proof-of-concept that measures the incremental lift of GenAI-based profiles over well-engineered behavioral ML features.
-
Engineer user-level features from large-scale behavioral data (category/product affinity, time-of-day and price-sensitivity patterns, per-user click/conversion history, recency-frequency signals).
-
Integrate LLM-generated user profiles into ranking models, including embedding generation, projection-layer tuning, gating, and ablation to ensure the signal is properly weighted.
-
Own the deep-learning ranking model (multi-task CTR/CVR architectures such as shared-bottom MTL), including feature integration, hyperparameter optimization (Bayesian/grid search), and bias correction (position/popularity).
-
Define and run the offline evaluation framework - NDCG, MRR, Precision/Recall at K - with segment-level analysis and ablation studies across user cohorts.
-
Establish the path to production: model serving and scheduled inference integration, shadow-mode testing, A/B framework readiness, and guardrail metrics.
-
Deliver clear technical documentation and lead knowledge-transfer sessions so the customer's teams can operate and iterate independently after handoff.
Requirements
-
10+ years in applied machine learning / data science, with deep hands-on experience in recommender systems, learning-to-rank, or large-scale personalization.
-
Practical experience building with LLMs in production: generating and integrating model-derived features or profiles, working with embeddings, and reasoning about evaluation, latency, and cost.
-
Experience with Amazon Bedrock or comparable managed LLM platforms for production inference.
-
Hands-on experience with segment- or cohort-based personalization, including measuring performance at the segment level rather than relying on aggregate metrics.
-
Experience designing cold-start strategies for users or items with limited history.
-
Strong communication skills - able to explain modeling decisions, trade-offs, and results clearly to engineers, data scientists, and senior business stakeholders, and to manage expectations through ambiguity.
-
Customer-facing or stakeholder-facing experience: building trust, navigating competing priorities, and serving as a senior technical voice in high-stakes conversations.
-
A track record of technical leadership through mentoring engineers, driving design decisions, and setting standards.
-
Strong track record taking ML models from experimentation to production, owning the offline-to-online validation story (ranking metrics, ablations, segment analysis, shadow testing, A/B readiness).
-
Deep, hands-on expertise in deep learning for ranking/recommendation - multi-task learning, embedding-based architectures - with a major framework (TensorFlow or PyTorch).
-
Strong feature engineering on large behavioral datasets using the modern data stack (PySpark, SQL, distributed data lakes).
-
Rigorous experimental methodology - hyperparameter optimization, bias correction, and a disciplined, hypothesis-driven approach to measuring true lift.
-
Hands-on AWS experience across the ML lifecycle, and strong proficiency in Python.
-
Experience personalizing ranking for marketplaces or consumer platforms at scale (ecommerce, food delivery, media, or similar).
-
MLOps maturity: model versioning, monitoring, and reproducible training pipelines.
-
Advanced degree in Computer Science, Machine Learning, Statistics, or a related quantitative field.
-
Prior experience in a client-facing consulting or professional-services delivery environment.