Infrastructure Engineer
Role details
Job location
Tech stack
Job description
We are seeking an experienced Infrastructure Engineer with strong expertise in Google Cloud Platform (GCP) and event-driven architecture to support the design, implementation, and operational management of a scalable enterprise event bus platform.

This role will focus on building and maintaining GCP-based event streaming infrastructure using Pub/Sub, delivering a robust schema registry capability, and enabling secure, auditable, and replayable messaging patterns across multiple teams and systems.

The ideal candidate combines cloud infrastructure engineering expertise with strong operational practices and experience supporting distributed event-driven systems in production environments.

Key Responsibilities
- Design, provision, and manage GCP Pub/Sub infrastructure for enterprise-scale event streaming
- Implement hardening, security, monitoring, and operational best practices for messaging platforms
- Deliver and maintain a schema registry solution to support cross-team data contract governance and compatibility management
- Externalise schema management capabilities for broader organisational consumption and self-service adoption
- Build and support event bus infrastructure enabling:
  o decoupled system integration
  o auditable event flows
  o replayable messaging patterns
  o resilient asynchronous communication
- Develop infrastructure-as-code (IaC) solutions and automation for repeatable platform provisioning
- Support platform observability, logging, alerting, and operational readiness
- Collaborate with application, platform, and data engineering teams to establish event streaming standards and best practices
- Provide ongoing operational support, maintenance, troubleshooting, and continuous improvement of the platform
- Contribute to documentation, runbooks, and governance processes
Requirements
- Strong experience with Google Cloud Platform (GCP)
- Hands-on expertise with GCP Pub/Sub administration and operations
- Experience designing and supporting event-driven or streaming architectures
- Knowledge of schema registry concepts and data contract management
- Experience implementing secure, scalable, and highly available infrastructure platforms
- Strong understanding of messaging reliability, replayability, and observability patterns
- Experience with Infrastructure as Code tools such as Terraform
- Familiarity with CI/CD pipelines and infrastructure automation
- Experience with monitoring and logging tools in cloud-native environments
- Strong troubleshooting and production support capabilities
- Excellent stakeholder communication and cross-team collaboration skills

Preferred Experience
- Experience with Kafka or other event streaming technologies
- Knowledge of API and event governance practices
- Experience operating shared enterprise platform services
- Understanding of security, IAM, networking, and compliance considerations within GCP
- Exposure to DevOps and Site Reliability Engineering (SRE) practices