Operations Engineer - Cloud Native DevOps Platform
Project Overview
The project focuses on building and operating an internal cloud-native developer platform designed to accelerate the development, deployment, and operation of software products across a hybrid cloud environment.
The platform provides self-service capabilities for application teams, enabling them to manage infrastructure, CI/CD pipelines, artifact repositories, and application lifecycle services while reducing operational overhead through automation and standardized DevOps tooling.
The DevOps for Applications (D4A) team is responsible for maintaining and evolving core platform services such as CI automation, artifact management, and developer self-service tools to ensure reliable and scalable software delivery.
Incident Management & Operational Stability
- Provide expert support during critical incidents across production and staging environments.
- Perform root cause analysis and advanced troubleshooting.
- Ensure operational stability and reliability of containerized cloud infrastructure.
Kubernetes Platform Operations
- Maintain and optimize Kubernetes-based deployments across platform environments.
- Ensure high availability and reliability of containerized workloads.
- Manage cluster operations within a cloud-native ecosystem.
CI/CD & DevOps Toolchain
- Maintain and operate CI/CD pipelines and automation frameworks.
- Support artifact repositories and image registries used across the developer platform.
- Ensure seamless operation of DevOps tools across the software development lifecycle (SDLC).
Security & Compliance
- Integrate security best practices into daily platform operations.
- Manage configuration and policy enforcement across DevOps tooling.
- Support secure software delivery and compliance with platform security standards.
Observability & Monitoring
- Drive observability practices across the D4A platform.
- Monitor system performance and identify bottlenecks proactively.
- Implement improvements to ensure reliable platform performance.
Process Improvement & Automation
- Identify inefficiencies in deployment and operational processes.
- Improve automation and streamline development workflows.
- Establish new operational processes when necessary to enhance platform efficiency.
Stakeholder Collaboration
- Act as a technical consultant for internal stakeholders.
- Support engineering teams in resolving complex operational challenges.
- Coordinate with internal and external stakeholders during incident resolution.
Requirements
- Proven experience in DevOps, SRE, or Operations Engineering roles.
- Strong knowledge of Kubernetes and cloud-native ecosystems.
- Experience with Infrastructure-as-Code (IaC) and automation practices.
- Experience with Git-based CI/CD systems and deployment tooling.
- Solid understanding of GitOps principles and containerized environments.
- Experience with Linux administration and scripting (Bash, Python, or Go).
- Knowledge of observability and monitoring tools.
- Experience working in enterprise-scale or multi-cluster environments.
- Experience with process optimization and change management.
Preferred Technical Stack
Deployment & GitOps
- ArgoCD
- OpenTofu Controller
- Kyverno Operator
- Harbor
CI/CD & Source Control
- GitLab
- Harness
Platform & Developer Tools
- Jira
- Jira Service Management
- Confluence
Additional Tools
- FluxCD
- Velero
- JFrog Artifactory
- Backstage
Languages
- Fluent English (C1) - written and spoken.