DevOps Engineer

at Grafana Labs
Charing Cross, United Kingdom
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
£ 125K

Job location

Remote
Charing Cross, United Kingdom

Tech stack

Clean Code Principles
Java
.NET
Artificial Intelligence
Cloud Computing
Code Review
Databases
Software Debugging
DevOps
Distributed Systems
Middleware
Python
Open Source Technology
Prometheus
Software Engineering
Rust
Data Ingestion
System Availability
Grafana
Backend
Kubernetes
Network Support
OPUS (Software)
GPT
Go
Programming Languages

Job description

Staff Software Engineer - Grafana Cloud Observability, Kubernetes Monitoring

The Opportunity

Grafana Cloud is our composable observability platform that integrates metrics, logs, and traces with Grafana. It allows our customers to leverage the best open source observability software - including Prometheus, Mimir, Loki, and Tempo - without the overhead of installing, maintaining and scaling their own observability stack.

The Observability department is focused on enabling developers to understand the health and performance of their applications and infrastructure in any environment by providing tools to instrument their code, ingest observability data into Grafana Cloud and visualize and explore it.

In this role, you will be part of the team that builds our Cloud Observability stack that allows customers to collect and visualize metrics from various systems and applications. We build and maintain the backend of opinionated applications such as Cloud Provider Observability, Database Observability, and Kubernetes Monitoring. This includes the dashboards, alerts, documentation, and infrastructure while working closely with other teams to ensure seamless experiences. We also strive to incorporate OSS contributions in our work by contributing to projects such as Alloy, Prometheus, OpenTelemetry, and Beyla. The Observability department provides a core building block for customers using Grafana Cloud.

As a company we are remote-first and global. We embrace people of different experiences and backgrounds to build diverse teams where every person brings a unique perspective to the software. We are looking for engineers who are passionate about communicating with data and providing seamless experiences for our customers to join our growing team! Engineers at Grafana also have the opportunity to contribute to Open Source communities.

What You'll Be Doing

At Grafana Labs, our engineers have a dedicated career path and do not have to become managers to progress in their career. Staff Software Engineers at Grafana have a large amount of experience across multiple areas. They are able to estimate, plan, coordinate and deliver large tasks spanning multiple systems. They actively coach and mentor other team members in their team and are able to identify and resolve issues with technology and product processes.

You will bring your passion for observability and software engineering expertise to help us take our infrastructure monitoring capabilities within Grafana Cloud to the next level. This will include working with our Kubernetes monitoring solution.

Design and implement high-quality, scalable integrations for various infrastructure components, applications, and data ingestion pipelines

Create middleware components and libraries that simplify development and maintenance of observability solutions

When necessary, represent Grafana Labs in open source forums, working groups, and events

Work with product teams, in addition to design and docs, to develop features that align with wider product strategy and customer needs

Lead the technical direction and vision of the team, contributing to strategic discussions and future development of observability solutions

Work with other departments including Sales, Product, and Support teams to deliver a holistic product experience

Take ownership of the services you're running by deploying well-tested, clean code

Embrace our open-source culture and contribute to other projects that may not directly fall within your team's scope

As we are remote-first and our engineering organization is entirely remote, we provide guidance and meet regularly using video calls, so an independent attitude, good communication skills, and transparency are a must.

We invest heavily in developer productivity. You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction. We encourage pragmatic AI-assisted development: faster prototyping, test generation, refactors, documentation, and incident follow-ups, always paired with strong code review and quality standards. You'll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.6, Gemini 3 Pro).

Requirements

You have a passion for observability and like to share your knowledge by writing documentation and blog posts.

You love to engage with customers and help them out.

You have excellent communication skills.

You have relevant open source experience, ideally in the observability domain.

You are willing to become an active member of the OpenTelemetry and Prometheus communities.

You're curious and you enjoy learning new programming languages and frameworks, setting up examples, and figuring out how things work.

You have a good understanding of typical production environments. Ideally you have been responsible for operating production services and organizing on-call.

You actively mentor other team members, identifying areas for focus and improvement.

Requirements

8+ years of strong experience with at least one major programming language (Python, .NET, Java, Go, Rust, etc.)

Demonstrated working experience operating and monitoring high-scale production systems running on Kubernetes, including on-call participation, incident response, and postmortem practices

Familiarity with observability tooling (e.g. Grafana)

Strong understanding of time-series data, metrics cardinality challenges, and cost/performance tradeoffs/optimizations in observability systems

Experience in a hands-on technical leadership role - setting technical direction, leading project teams, and influencing architectural decisions beyond your immediate team

Deep understanding of distributed systems concepts including scalability, consistency, high availability, and failure modes in large-scale systems

Experience writing clean, maintainable, robust, and performant software

Experience with delivering projects from start to finish in a self-driven manner

Excellent problem-solving and debugging skills

Strong mentoring and leadership skills

Bonus Points For

Experience operating or scaling Prometheus in high-cardinality, multi-tenant environments

Experience working with OpenTelemetry Collector pipelines or similar telemetry ingestion systems

Certified Kubernetes Administrator (CKA), Certified Kubernetes Application Developer (CKAD), or any other Kubernetes-related certification from CNCF

Experience developing Kubernetes operators, controllers, or custom resources

Strong understanding of metrics collection, visualization, and alerting concepts

Experience contributing to or maintaining open source projects, with evidence of successful pull requests and community collaboration

Experience designing and building observability backends for various systems and applications

About the company

Grafana Labs is a remote-first, open-source powerhouse. There are more than 20M users of Grafana, the open source visualization tool, around the globe, monitoring everything from beehives to climate change in the Alps. The instantly recognizable dashboards have been spotted everywhere from a NASA launch and Minecraft HQ to Wimbledon and the Tour de France. Grafana Labs also helps more than 3,000 companies, including Bloomberg, JPMorgan Chase, and eBay, manage their observability strategies with the Grafana LGTM Stack, which can be run fully managed with Grafana Cloud or self-managed with the Grafana Enterprise Stack, both featuring scalable metrics (Grafana Mimir), logs (Grafana Loki), and traces (Grafana Tempo).

We're scaling fast and staying true to what makes us different: an open-source legacy, a global collaborative culture, and a passion for meaningful work. Our team thrives in an innovation-driven environment where transparency, autonomy, and trust fuel everything we do. You may not meet every requirement, and that's okay. If this role excites you, we'd love you to raise your hand for what could be a truly career-defining opportunity.

This is a remote opportunity and we would be interested in applicants based in Spain, Germany, the UK or Sweden at this time.

Compensation & Rewards

In the United Kingdom, the compensation range for this role is GBP 103,958 - GBP 124,750. Actual compensation may vary based on level, experience, and skillset as assessed throughout the interview process. All of our roles include Restricted Stock Units (RSUs), giving every team member ownership in Grafana Labs' success. We believe in shared outcomes: RSUs help us stay aligned and invested as we scale globally. Compensation ranges are country specific. If you are applying for this role from a different location than listed above, your recruiter will discuss your specific market's defined pay range & benefits at the beginning of the process.

Why You'll Thrive At Grafana Labs

100% Remote, Global Culture - As a remote-only company, we bring together talent from around the world, united by a culture of collaboration and shared purpose.

Scaling Organization - Tackle meaningful work in a high-growth, ever-evolving environment.

Transparent Communication - Expect open decision-making and regular company-wide updates.

Innovation-Driven - Autonomy and support to ship great work and try new things.

Open Source Roots - Built on community-driven values that shape how we work.

Empowered Teams - High trust, low ego culture that values outcomes over optics.

Career Growth Pathways - Defined opportunities to grow and develop your career.

Approachable Leadership - Transparent execs who are involved, visible, and human.

Passionate People - Join a team of smart, supportive folks who care deeply about what we do.

In-Person onboarding - We want you to thrive from day 1 with your fellow new 'Grafanistas' to learn all about what we do and how we do it.

Balance is Key - We operate a global annual leave policy of 30 days per annum. 3 days of your annual leave entitlement are reserved for Grafana Shutdown Days to allow the team to really disconnect.

*We will comply with local legislation where applicable.

Equal Opportunity Employer

We will recruit, train, compensate and promote regardless of race, religion, color, national origin, gender, disability, age, veteran status, and all the other fascinating characteristics that make us different and unique. We believe that equality and diversity build a strong organization and we're working hard to make sure that's the foundation of our organization as we grow.

Grafana Labs may utilize AI tools in its recruitment process to assist in matching information provided in CVs to job postings. The recruitment team will continue to review inbound CVs manually to identify alignment with current openings.