AI/ML Engineer
Role details
Job location
Tech stack
Job description
The AI/ML Engineer is a core member of the AI/AWS Team, responsible for designing and deploying machine learning models and data pipelines on AWS. This role focuses on the technical setup, configuration, and management of large-scale data ingestion and transformation processes on AWS, using automation to accelerate data science and ML operationalization (MLOps).
* Data Transformation and ML Pipeline Management
-
Configure and manage AWS services (e.g., AWS Glue, SageMaker Data Wrangler) to set up data connectors and run large-scale data transformation jobs.
-
Select and execute AI/ML capabilities such as feature engineering, data quality checks, model training, performance analysis, and model deployment pipelines (MLOps).
-
Review and assess model training job outputs, including feature importance reports, data drift metrics, and model performance baselines, to inform deployment decisions.
* Platform and Infrastructure
-
Set up and secure AWS accounts and S3 buckets, and configure the IAM permissions needed for secure data transfer and access in ML workflows.
-
Provision and manage target cloud infrastructure for ML model serving and data processing using Infrastructure as Code (IaC) templates (AWS CloudFormation, AWS Cloud Development Kit (CDK), or Terraform).
-
Manage CI/CD (or MLOps) pipelines to support the continuous integration and deployment of models and microservices.
* Model and Data Handling
-
Organize and manage large datasets and required code artifacts (including training data, feature stores, Python scripts, and Jupyter notebooks) in secure data repositories (e.g., S3).
-
Develop and review production-grade model code and associated scripts (e.g., for inference) to ensure performance and maintainability, optionally enabling monitoring tools for model quality and drift detection.
* Model Testing and Validation
-
Generate test artifacts, including model validation metrics and test automation scripts, to support functional and performance testing of deployed ML models.
Requirements
-
Experience configuring and managing AWS services, specifically Amazon S3 and IAM permissions, and ML services such as Amazon SageMaker, within an enterprise environment.
-
Technical understanding of machine learning principles, model lifecycle management, and MLOps practices.
-
Proficiency with Infrastructure as Code (IaC) tooling, such as AWS CloudFormation, AWS CDK, or Terraform.
-
Knowledge of cloud-native development and deployment practices, including microservices, CI/CD, and AWS compute services (ECS, EKS, Lambda, Fargate).
-
Familiarity with data transformation and processing methodologies (e.g., Spark, AWS Glue, EMR) and the phases of the ML lifecycle (Data Prep, Training, Tuning, Deployment, Monitoring).