Platform Reliability Engineer
Job description
We are looking for a Platform Reliability Engineer to join a Private Cloud team within a large enterprise environment. In this role, you will be responsible for designing, building, and improving a scalable and reliable platform that supports critical infrastructure services. The team operates in a DevOps-driven setup, where automation, standardization, and performance are key to delivering high-quality services across the organization.
As a Platform Reliability Engineer, you will work on a platform that enables internal teams to deploy and manage infrastructure efficiently. You will contribute to the development of CI/CD pipelines, improve platform reliability through monitoring and incident handling, and help drive automation across the landscape. The environment is highly collaborative, with self-organizing teams working closely together to continuously improve the platform and developer experience.
About the role
In this role, you will focus on building and maintaining a reliable private cloud platform while enabling teams to consume infrastructure in a standardized and automated way. You will be working with modern technologies and practices, with a strong emphasis on DevOps and platform engineering.
You will:
- Build & operate the platform: Manage private cloud components and CI/CD pipelines, ensuring reliability, security, and performance.
- Enable self-service: Make the platform consumable via templates, catalogues, guardrails, and standardization.
- Own platform reliability: Drive monitoring, incident response, and continuous reliability improvements of the platform itself.
- Security & compliance by design: Implement RBAC, manage secrets, and ensure platform compliance.
- Optimize scalability & performance: Continuously improve cluster capacity, platform performance, and cost-efficiency.
- Collaborate & support: Work closely with other delivery teams to troubleshoot issues and improve developer experience.
- Participate in on-call rotations: Support platform availability and respond to incidents when required.
The role offers a flexible working setup, with a hybrid model and a strong focus on collaboration and continuous improvement within a modern infrastructure organization.
Requirements
- Advanced knowledge of ServiceNow
- Experience with CI/CD pipelines and DevOps practices
- Knowledge of containerization technologies (e.g., Docker, Kubernetes)
- Knowledge of Ansible
- Strong understanding of virtualization technologies and best practices
- General knowledge of infrastructure services