Senior Machine Learning Engineer
Job description
As a Senior Machine Learning Engineer at Kyndryl's AI Innovation Hub, you'll be part of the team that transforms ideas and models into scalable, production-grade AI solutions. Working alongside architects and data scientists, you'll design and optimize the data and ML pipelines that power intelligent systems across industries. Your mission will be to turn experimental models into efficient, reliable, and maintainable products - bridging the gap between innovation and execution. You'll work in an environment where automation, engineering excellence, and curiosity converge, driving the continuous evolution of our AI capabilities. This is a role for those who combine strong technical craftsmanship with a builder's mindset and a passion for making AI work in the real world.
Your Mission
- Build and optimize end-to-end ML pipelines, ensuring scalability, efficiency, and reproducibility.
- Collaborate with data scientists and architects to bring models from prototype to production, integrating them seamlessly into enterprise systems.
- Automate the full model lifecycle - from data ingestion and training to validation, deployment, and monitoring.
- Implement MLOps best practices, ensuring robust CI/CD, testing, and observability across AI workloads.
- Contribute to the Hub's technical excellence by evaluating emerging tools, frameworks, and methodologies in ML engineering.
- Champion software engineering standards, code quality, and documentation to ensure reliability and maintainability.
- Continuously improve performance, resource efficiency, and operational resilience of deployed models.
- Collaborate with cross-functional teams to align AI solutions with business goals and enterprise architecture standards.
Who You Are
Requirements
- 3-5 years of experience developing and deploying AI/ML models in production environments.
- Strong proficiency in Python and major ML libraries (TensorFlow, PyTorch, Scikit-learn, XGBoost, etc.).
- Hands-on experience with MLOps frameworks (MLflow, Kubeflow, Airflow, DVC) and CI/CD automation (GitHub Actions, Jenkins, Azure DevOps).
- Experience with containerization and orchestration (Docker, Kubernetes).
- Solid understanding of cloud AI platforms (Azure ML, Vertex AI, SageMaker, OpenShift AI).
- Proven skills in data preprocessing, cleaning, and versioning using DataOps practices.
- Experience monitoring and maintaining models in production (data drift, model drift, retraining, observability).
- Familiarity with relational, NoSQL, and vector databases (SQL, MongoDB, FAISS, Milvus, ChromaDB).
- Understanding of security, compliance, and FinOps principles in large-scale AI workloads.
Education & Certifications
- Bachelor's or Master's degree in Computer Engineering, Data Science, Mathematics, Physics, or related field.
- Postgraduate studies (Master's in Artificial Intelligence, Data Science, or Software Engineering) are highly valued.
- Certifications in cloud platforms (Azure, AWS, GCP) or MLOps frameworks are a plus.
- Proven commitment to continuous learning and staying up to date with advances in AI engineering and automation.
Preferred Skills
- Experience working with LLMs, RAG architectures, or multi-agent systems.
- Knowledge of feature engineering, data lineage, and metadata management for ML pipelines.
- Exposure to streaming data and real-time model serving.
- Understanding of microservice-based architectures and API design for AI integrations.
- Familiarity with observability tools (Prometheus, Grafana).
- Ability to design reusable components and templates for rapid experimentation and deployment.
- Passion for automation, optimization, and reproducibility in ML workflows.
Soft Skills
- Collaborative mindset, working effectively with architects, data scientists, and developers toward shared goals.
- Strong analytical thinking and problem-solving abilities, balancing rigor with creativity.
- Clear communication, able to explain technical topics to both experts and non-specialists.
- Attention to detail and dedication to high-quality, maintainable, and well-documented code.
- Result-oriented approach, focused on delivering impactful, production-ready solutions.
- Curiosity and initiative, continuously exploring new frameworks, methodologies, and emerging AI tools.