Senior Platform Engineer (Multi-Cloud & AI Adoption)
Tech Data
Fremont, United States of America
1 month ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Fremont, United States of America
Tech stack
Microsoft Windows
API
Artificial Intelligence
Amazon Web Services (AWS)
Azure
Bash
Ubuntu (Operating System)
Cloud Computing
Configuration Management
Continuous Integration
Data Structures
Linux
DevOps
GitHub
Python
PowerShell
Red Hat Enterprise Linux - RHEL
Ansible
Prometheus
Datadog
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Grafana
Technical Debt
Terraform
Ansible Tower
Job description
- Platform as a Product: Treat the platform as a product for our internal developers. Gather feedback, identify bottlenecks, and build self-service tools that eliminate manual tickets.
- Infrastructure & AI Orchestration: Design and scale multi-cloud architectures (Azure, AWS, GCP). Specifically, architect the foundational infrastructure for AI/ML services (e.g., Azure AI Foundry, AWS Bedrock) to ensure they are secure and accessible.
- Automated "Golden Paths": Develop and maintain high-quality Terraform modules and Ansible playbooks that serve as the standard blueprints for the organization.
- Developer Empowerment: Build and integrate with Internal Developer Portals (like Port or Backstage) to provide a single pane of glass for service catalogs, scorecards, and automated resource provisioning.
- CI/CD & GitOps: Modernize our delivery pipelines in Azure DevOps/GitHub, ensuring that infrastructure changes are version-controlled, tested, and deployed with zero downtime.
- Observability & Reliability: Implement "Self-Healing" infrastructure and advanced monitoring (Datadog, Prometheus, Grafana) so developers have full visibility into their own services.
- Strategic Collaboration: Act as a consultant to business units, challenging technical assumptions and translating high-level business needs into elegant, scalable technology execution.
Requirements
Multi-Cloud & AI Infrastructure
- Cloud Proficiency: Deep hands-on experience in Azure plus either AWS or GCP (GCP is a plus).
- AI Stack: Hands-on experience deploying and managing cognitive services (Azure AI Foundry, AWS Bedrock, or Vertex AI).
- Systems: Strong Linux (RHEL, Ubuntu) and Windows host administration via automated configuration management.
- Terraform: Advanced knowledge of Terraform data structures, functions, and state management. Experience building versioned, reusable modules from scratch.
- Ansible: Mastery of playbooks, roles, collections, and dynamic inventories. Experience with Ansible Tower or Red Hat Satellite for enterprise-scale configuration.
- Scripting: Fluency in Python, Bash, or PowerShell for custom automation and API integrations.
Delivery & Governance
- DevOps Tools: Advanced use of Azure DevOps (Pipelines, Repos, Boards) and GitHub Actions.
- Security & Compliance: Ability to bake security (Azure Policies, Custom Roles, Encryption) directly into the IaC templates and platform workflows.
Soft Skills & "Port-Inspired" Mindset
- User-Centricity: You view developers or businesses as your customers. You are passionate about improving Developer Experience (DX).
- Analytical Depth: Ability to decompose high-level information into modular tasks and distinguish a user's request from their true underlying need.
- Continuous Improvement: A drive to eliminate technical debt and move toward "Agentic" engineering where AI and automation handle routine operational tasks.