Azure Stack AI DevOps Specialist

VAK CONSULTING LLC
Chicago, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Chicago, United States of America

Tech stack

Artificial Intelligence
Application Performance Management
Azure
Bash
DevOps
Field-Programmable Gate Array (FPGA)
GitHub
Python
Machine Learning
Performance Tuning
PowerShell
Role-Based Access Control
Ansible
TensorFlow
Smart Devices
Cloud Platform System
Hybrid Cloud
Kubernetes
Bicep
Machine Learning Operations
Hardware Infrastructure
Terraform
Software Version Control
Bamboo
Docker
Jenkins

Job description

The Azure Stack AI DevOps Specialist designs, implements, and manages CI/CD pipelines for AI and Machine Learning applications hosted on Azure Stack infrastructure. You ensure that infrastructure is managed as code (IaC) and that AI models are deployed, monitored, and retrained reliably in hybrid cloud environments.

Key Roles & Responsibilities

  1. Hybrid Infrastructure Management
     Provisioning: Use Terraform or Bicep to automate the setup of Azure Stack Hub or Azure Stack Edge resources.
     Scalability: Configure GPU-enabled nodes on Azure Stack to handle intensive AI/ML workloads.
     Governance: Implement Azure Policy and Role-Based Access Control (RBAC) to maintain security across on-premises and cloud environments.

  2. MLOps & CI/CD Pipelines
     Automation: Build end-to-end pipelines using Azure Pipelines or GitHub Actions to automate model training, testing, and deployment.
     Model Versioning: Manage model artifacts and datasets to ensure reproducibility of AI results.
     Edge Deployment: Orchestrate the deployment of AI models to Azure Stack Edge devices using IoT Edge and Kubernetes (AKS).

  3. Monitoring and Optimization
     Observability: Implement Azure Monitor and Application Insights to track the health of the infrastructure and the performance of the AI models (e.g., detecting data drift).
     Performance Tuning: Optimize resource allocation for containers running AI inference to reduce latency at the edge.

  4. Security & Compliance
     DevSecOps: Integrate security scanning into the pipeline to check for vulnerabilities in container images and AI libraries.
     Data Residency: Ensure that AI processing complies with local data residency laws by keeping sensitive data on the Azure Stack Hub within the local datacenter.

Technical Skill Requirements

Category: Key Tools & Skills
Cloud Platforms: Azure Stack Hub, Azure Stack Edge, Azure Stack HCI
DevOps Tools: Azure DevOps, GitHub Actions, Jenkins
IaC & Configuration: Terraform, Bicep, ARM Templates, Ansible
Containers: Docker, Azure Kubernetes Service (AKS) on Stack
AI/ML Frameworks: Azure Machine Learning, PyTorch, TensorFlow, MLflow
Scripting: Python (crucial for AI), PowerShell, Bash

Requirements

Key Differences from a Standard Azure DevOps Role

Connectivity Awareness: You must design systems that can function in disconnected or low-bandwidth scenarios (common in Azure Stack environments).
Hardware Knowledge: Understanding the physical constraints of Azure Stack Edge (such as FPGA or GPU capabilities) is necessary for optimizing AI models.
