Lead, Site Reliability Engineer

Royal Caribbean International
Miramar, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Miramar, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Computing Platforms
Azure
Computer Networks
Continuous Integration
DevOps
Disaster Recovery
GitHub
Identity and Access Management
Key Management
OpenID
PCI Data Security Standards
Role-Based Access Control
Reliability Engineering
Ansible
SonarQube
TypeScript
DevOps Tools - Open-source
React
System Availability
MTTR
Git Flow
Kubernetes
Information Technology
Functional Programming
CloudWatch
Terraform
Software Version Control
ServiceNow

Job description

The Lead, Site Reliability Engineer (SRE) provides technical and strategic leadership for Royal Caribbean Group's DevOps and platform engineering ecosystem. This role defines standards, guides platform architecture, and drives enterprise-wide initiatives across CI/CD, Kubernetes, GitOps, observability, security, and AI-enabled automation to support reliable, scalable software delivery. The engineer will lead platform design and evolution, drive intelligent automation, and ensure robust integration of DevOps tooling with business processes, fostering operational excellence and innovation.

  • Owns SRE and DevOps strategy across AWS and Azure, architecting cloud patterns for high availability, disaster recovery, and cost optimization.
  • Leads Kubernetes/Helm platform design and evolution (EKS, AKS) supporting production workloads.
  • Drives AI-assisted SRE capabilities by identifying opportunities for intelligent automation, remediation, and operational insights across CI/CD and platform operations.
  • Owns the GitHub Actions platform, designing reusable workflows and enforcing fully automated end-to-end pipelines.
  • Mandates Snyk and SonarQube in all pipelines, enforcing security gates, quality thresholds, and exemption workflows.
  • Integrates Terraform IaC execution directly within CI/CD, ensuring infrastructure changes flow through automated controls.
  • Owns Backstage lifecycle, including catalog, scaffolder templates, plugin integrations, and adoption governance.
  • Builds Software Templates that pre-wire CI/CD, Terraform modules, and security tooling for new services from day one.
  • Owns pipeline-to-ServiceNow integration, automating change/release records and gating deployments against approved change windows.
  • Leads, mentors, and grows a team of SRE and DevOps engineers, owning technical escalation and platform SLAs/SLOs.
  • Drives engineering culture through blameless post-mortems, runbooks, documentation, and operational excellence.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or related field required; Master's degree preferred.

  • 7+ years in SRE/DevOps/Platform Engineering, with at least 2 years in a technical lead or staff-level role.

  • Deep expertise in AWS (EKS, EC2, IAM, Lambda, CloudWatch) and Azure (AKS, Entra ID, Azure Monitor).

  • Expert in Terraform (modules, remote state, pipeline-automated execution, GitOps workflows).

  • Advanced proficiency with GitHub Actions (multi-job workflows, reusable actions, OIDC, secrets management).

  • Production Kubernetes experience (cluster lifecycle, Helm authoring, RBAC, network policies).

  • Hands-on experience with Backstage (catalog config, scaffolder templates, plugin integration, governance).

  • Demonstrated Snyk and SonarQube pipeline integration with enforced security and quality gates.

  • Experience integrating DevOps tooling with ServiceNow change, release, or Digital Release.

  • Proven track record of reducing deployment lead time or MTTR, or of improving platform reliability.

  • Hospitality, travel, or high-volume consumer tech experience.

  • AWS Solutions Architect Professional and CKA/CKAD certifications are a strong plus.

  • Experience with GitOps tooling (ArgoCD, Flux) and progressive delivery (canary, blue/green).

  • Backstage plugin development (TypeScript/React).

  • PCI-DSS, SOC 2, or travel industry compliance background.

  • Source Control: Git / GitHub

  • CI/CD: GitHub Actions

  • IaC: Terraform, Ansible

  • Containers: Kubernetes / Helm (EKS, AKS)

  • Cloud: AWS and Azure

  • Dev Portal: Backstage

  • Security: Snyk, SonarQube

  • Release: ServiceNow Digital Release

  • Effective mentor and collaborator, able to build capability and drive adoption.

  • Strong interpersonal skills to communicate with all levels of management.

  • Ability to work independently and as part of a cross-functional team.

About the company

Journey with us! Combine your career goals and sense of adventure by joining our exciting team of employees. Royal Caribbean Group is pleased to offer a competitive compensation and benefits package, and excellent career development opportunities, each offering unique ways to explore the world. We are proud to be the vacation-industry leader with global brands - including Royal Caribbean International, Celebrity Cruises and Silversea Cruises - the most innovative fleet and private destinations, and the best people. Together, we are dedicated to turning the vacation of a lifetime into a lifetime of vacations for our guests.

It is the policy of the Company to ensure equal employment and promotion opportunity to qualified candidates without discrimination or harassment on the basis of race, color, religion, sex, age, national origin, disability, sexual orientation, sexuality, gender identity or expression, marital status, or any other characteristic protected by law. Royal Caribbean Group and each of its subsidiaries prohibit and will not tolerate discrimination or harassment.
