Staff Backend Engineer - Grafana Enterprise | US | Remote
Job description
- Earning the trust of our large-scale operator customers to further Grafana's "big tent" philosophy of data accessibility and to meet clear business objectives
- Designing and leading the development of backend services, distributed systems, and enterprise features at scale
- Driving continuous improvement of our engineering culture through words and actions
- Driving projects from initial ideation through the development lifecycle to production
- Contributing to the scalability, reliability, security, and multi-tenancy of the Grafana platform trusted by some of the world's largest operators
- Owning the operational health of our platform by participating in weekday 12h x 5d and separate weekend 24h x 2d on-call rotations. (Yes, we prioritize ops load reduction.)
- Hiring and developing the best engineers to build the future of Grafana
- Developing your skills as a thought leader to drive continuous improvement of engineering and operational practices across Grafana Labs
- As we are remote-first, we provide guidance and meet regularly using video calls, so strong teamwork and excellent written and interpersonal communication skills are a must.
We invest heavily in developer productivity. You can use modern AI coding assistants as part of your daily workflow (your choice of tools, within security guidelines), backed by a company-funded usage budget so you can iterate quickly without unnecessary friction.
We encourage pragmatic AI-assisted development: faster prototyping, test generation, refactors, documentation, and incident follow-ups, always paired with strong code review and quality standards.
You'll also have access to frontier models (e.g., GPT-Codex 5/3, Claude Opus 4.7, Gemini 3 Pro).
What Makes You a Great Fit:
- You work well as a communicative member of a team of engineering professionals.
- You earn trust by saying what you mean and doing what you say.
- You are customer focused and especially attuned to the needs of large-scale operators who rely on Grafana as critical infrastructure. You start with their needs and work backwards.
- You insist on the highest standards and work to develop the skills and knowledge of your fellow team members.
- You take on complex distributed systems challenges, break them down into digestible problems, and leverage your team and organization to deliver.
- You design modular solutions, deliver minimum loveable products, gather data and feedback, and then progress iteratively.
What will you be doing? (Role specifics)
As a Staff Backend Engineer, you will design and build the backend systems powering Grafana Enterprise, the platform trusted by the world's largest operators to run their observability and software delivery infrastructure.
- Hiring and developing the best engineers we can to deliver the future of Observability.
- Architecting and implementing distributed backend services in Go, with a focus on correctness, observability, and performance at scale
- Designing APIs and service contracts used by thousands of enterprise operators and cloud service providers
- Collaborating with Product and UX to shape features and partnering with frontend engineers to ship complete, end-to-end solutions
- Driving scalability and reliability improvements that matter to large-scale operators running Grafana in regulated, high-availability environments
- Engaging directly with large enterprise customers and cloud service providers to understand their requirements and translate them into robust engineering solutions
- Advocating for our customers at every stage of the development lifecycle
Requirements
- Deep professional experience writing production services, from ideation through to production operations at scale
- Strong distributed systems fundamentals: replication, consistency models, partitioning, fault tolerance, and the trade-offs that come with operating at scale
- Demonstrated experience designing and operating systems for large-scale, high-traffic, high-availability, or multi-tenant environments, ideally in the context of infrastructure, observability, or software delivery platforms
- Professional experience building and consuming gRPC/protobuf APIs and designing clean service contracts across service boundaries
- Strong database skills with PostgreSQL and/or MySQL, including schema design, query optimisation, and schema migrations at scale
- Experience with large-scale CI/CD systems and build tooling: designing, operating, or integrating with continuous delivery pipelines that serve large engineering organisations or external operators at scale
- Comfort working with Kubernetes and containerised deployment environments, including patterns for operating stateful workloads and multi-tenant clusters
- Experience with observability tooling: OpenTelemetry, Prometheus metrics, structured logging, and distributed tracing
- Familiarity with dependency injection patterns (e.g., Google Wire) and clean, testable service architecture
What are your "nice to have" technical competencies?
- Experience with TypeScript and React for contributing to frontend features and collaborating closely with frontend engineers
- Experience with Grafana's LGTM+ observability stack (Loki, Mimir, Tempo, Pyroscope, Alloy)
- Prior experience at or building for large-scale cloud service providers, IaaS providers, or global enterprises with demanding SLA requirements
- Experience designing or operating large-scale build infrastructure: artifact registries, distributed build caches, hermetic build systems (e.g., Bazel), or developer platform tooling