Senior Software Engineer (Machine Learning)

Shield AI
San Diego, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote
San Diego, United States of America

Tech stack

Java
API
Artificial Intelligence
Algorithm Design
Systems Engineering
Artificial Neural Networks
Big Data
C++
Continuous Integration
Information Engineering
Data Structures
Distributed Computing Environment
Distributed Systems
E-Business
Statistical Hypothesis Testing
Python
Machine Learning
Object-Oriented Software Development
Recommender Systems
TensorFlow
Scala
Software Engineering
SQL Databases
Management of Software Versions
Supervised Learning
Freeform SQL
PyTorch
Large Language Models
Spark
Deep Learning
Data Lineage
Apache Flink
ONNX (Open Neural Network Exchange) Format
Production Code
XGBoost
Kafka
Machine Learning Operations
TensorRT
Go
Microservices

Job description

We're hiring a Senior Software Engineer (Machine Learning) to architect, build, and deploy high-performance machine learning systems that power our technology stack. You will work across the entire ML lifecycle, from processing massive volumes of data to developing and deploying low-latency models.

Scale Data Engineering & Feature Pipelines

  • Process and extract features from massive, highly sparse datasets (terabytes/petabytes of bidstream and user event data) using SQL, Python, and distributed computing frameworks (e.g., Spark, Ray)
  • Architect offline and online feature pipelines. Manage real-time feature computation and low-latency feature stores, ensuring zero online/offline skew
  • Perform rigorous missingness analysis, leakage checks, and handle high-cardinality categorical variables safely

Core ML & Deep Learning Development

  • Train, tune, and scale supervised learning models, utilizing advanced gradient boosting (XGBoost, LightGBM, CatBoost) and Factorization Machines
  • Design and implement Deep Learning architectures for structured/recommendation data using PyTorch or TensorFlow
  • Apply rigorous tabular modeling practices: meticulous leakage prevention, class imbalance strategies, and robust cross-validation on time-split data

Productionization, MLOps, & System Engineering

  • Write clean, object-oriented, and modular production code. Transition models from Python research environments to high-performance serving environments (packaging with ONNX, TensorRT, etc.)
  • Design and maintain robust MLOps pipelines: automated model retraining, versioning, shadow deployments, and CI/CD for machine learning
  • Monitor production models for data drift, concept drift, and performance degradation in real time, implementing automated alerting and fallback mechanisms

Evaluation & Experimentation

  • Design rigorous A/B and multivariate tests to measure the true business incrementality of ML models
  • Choose appropriate offline metrics (PR-AUC, normalized entropy/log loss, calibration, lift) and bridge them to online business KPIs

Success in This Role Looks Like

  • You deliver models that perform well and move business metrics (revenue lift, cost reduction, risk reduction, improved forecast accuracy, operational efficiency)
  • Your work is reproducible and production-aware: clear data lineage, robust evaluation, and a credible path to deployment/monitoring
  • Stakeholders trust your judgment in selecting methods and communicating uncertainty honestly

Requirements

You must possess a strong hybrid skill set: deep expertise in applied machine learning combined with production-grade software engineering skills. You will not just build models in notebooks; you will write scalable, production-ready code, design real-time inference APIs, and ensure your systems meet strict latency and high-throughput requirements.

  • 5-8+ years of experience as a Machine Learning Engineer or Software Engineer focusing on ML systems, ideally within Ad Tech, MarTech, or high-scale recommendation systems

  • Production Engineering Skills: Strong software engineering fundamentals (OOP, data structures, algorithm design). Expert-level Python and strong proficiency in a compiled or high-performance language (e.g., C++, Java, Scala, Go, or Rust)
  • ML Systems & Serving: Deep experience deploying machine learning models into highly concurrent, low-latency production environments (APIs, microservices, Triton Inference Server, custom containers)
  • Distributed Computing: Hands-on experience with big data processing (Apache Spark, Kafka, Flink) and complex SQL queries
  • Core ML & Deep Learning: Proven track record of shipping both tree-based models and neural networks (PyTorch/TensorFlow) to production
  • Statistics & Experimentation: Solid grasp of statistics, hypothesis testing, and rigorous A/B experiment design

Nice-to-Have

  • Agentic / GenAI Development: Experience designing agentic workflows or utilizing LLMs to automate ad creative generation, campaign copilot tools, or internal ML development workflows (AI-assisted IDEs, code agents)

About the company

Fusemachines is a leading AI strategy, talent, and education services provider. Founded by Sameer Maskey Ph.D., Adjunct Associate Professor at Columbia University, Fusemachines has a core mission of democratizing AI. With a presence in 4 countries (Nepal, the United States, Canada, and the Dominican Republic) and more than 450 full-time employees, Fusemachines brings global AI expertise to transform companies worldwide.

Founded in 2013, Fusemachines is a global provider of enterprise AI products and services, on a mission to democratize AI. Leveraging proprietary AI Studio and AI Engines, the company helps drive clients' AI Enterprise Transformation, regardless of where they are in their Digital AI journeys. With offices in North America, Asia, and Latin America, Fusemachines provides a suite of enterprise AI offerings and specialty services that allow organizations of any size to implement and scale AI. Fusemachines serves companies in industries such as retail, manufacturing, and government. Fusemachines continues to actively pursue the mission of democratizing AI for the masses by providing high-quality AI education in underserved communities and helping organizations achieve their full potential with AI.

About Shield AI 201-500
