Observability SRE / Platform Engineer
Job description
Observability SRE / Platform Engineer / Production Engineer sought to join an elite Multi-Strategy Quant Hedge Fund with $40BN+ in Assets Under Management.
Our client is one of the world's top Hedge Funds, especially renowned for innovation and investment in Data & Technology. They are now looking to recruit a Low Latency Trading & Observability SRE / Platform Engineer / Production Engineer into their core Production Engineering team, which is accountable for the reliability, operability and performance of the firm's trading-critical systems, in an environment where availability, correctness and latency directly impact outcomes.
The successful hire will own the reliability of business-critical systems, from observability design through to incident resolution and systemic improvement; lead high-severity incident management; and reduce toil through software engineering, primarily in Python but also in Golang, TypeScript, SQL and/or PowerShell. The role requires someone who can evidence outstanding problem-solving skills and the ability to work on their own initiative.
Requirements
- Debugging distributed systems: operating, improving and scaling complex systems in high-availability environments.
- SRE fundamentals: SLO/SLI thinking, observability, incident leadership, and a bias for systemic platform fixes.
- Strong software engineering skills: high proficiency in at least one modern programming language (Python preferred).
- Modern open-source observability knowledge: experience within the ecosystem (e.g. OpenTelemetry, LGTM stack, Prometheus, Grafana, Loki).
- Unsiloed cross-platform operator: confidence taming complex technical landscapes spanning multiple languages, platforms and data systems, i.e. Linux, Cloud, Kubernetes, Windows.
- Strong communication skills: able to explain technical issues and trade-offs to non-experts under time pressure.
- Relationship-driven mindset with high accountability: builds trust, aligns stakeholders, and takes end-to-end ownership for results.