Senior Software Engineer/SRE - Application Middleware

Bloomberg's Application Middleware Group

Charing Cross, United Kingdom

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Charing Cross, United Kingdom

Tech stack

C++

Configuration Management

Data Structures

Database Theory

Distributed Systems

Middleware

Fault Tolerance

Systems Analysis

Python

Multicasting

Network Protocols

Software Engineering

Transmission Control Protocol (TCP)

Reliability of Systems

Kubernetes

Information Technology

Operational Systems

Job description

Design and implement scalable, fault-tolerant systems with a focus on observability, performance, and automation
Collaborate across engineering teams to introduce automated, self-service operational workflows
Conduct deep systems analysis and root cause investigations for complex, distributed systems
Propose and prototype innovative approaches to reliability and risk mitigation
Contribute to design docs, runbooks, and post-incident reviews; clear communication is part of the job

Requirements

A degree in Computer Science, Engineering, Mathematics, or equivalent practical experience
Strong software engineering skills in any high-level language (we mainly use Python and C++)
A deep understanding of software system reliability and risk management, including how to identify potential points of failure and design mitigation strategies
A good understanding of data structures, algorithms, and system design
Experience navigating and improving large, distributed codebases
An ability to identify system risks and engineer around points of failure
Clear written and verbal communication, including technical documentation and incident analysis

We'd Love to See

We are building a team with a breadth of expertise and value depth in any of the following areas:

Systems Knowledge: A strong grasp of operating systems, fundamental networking protocols (TCP, UDP, multicast), or core database concepts as they apply to modern infrastructure.
Cluster Management: Experience with deployments, staging, and configuration management. Direct experience with Argo, Kubernetes, or other pipeline management platforms is a significant advantage.
Machine Management at Scale: Experience with capacity planning and automating the lifecycle of large machine fleets.
System Observability and Monitoring: Deep understanding of SLIs/SLOs/SLAs, alerting, and building dashboards for complex systems.
Reliability in Distributed Systems: Knowledge of fault tolerance and the unique challenges of network and node failure in distributed environments.
Mentoring: Proven experience mentoring and growing junior engineers.

About the company

Are you passionate about building high-performance systems that are fast and resilient and that operate at global scale? Join Bloomberg's Application Middleware SRE team, where you'll combine software engineering and systems expertise to keep the backbone of the Bloomberg Terminal running smoothly for hundreds of thousands of users around the world.

We're not your typical SRE team. We're embedded in a group that powers real-time connectivity, and we own systems where uptime isn't just important; it's essential to the global financial system. This is your opportunity to engineer resilience at scale, automate critical infrastructure, and shape reliability practices across one of the world's most powerful tech platforms.

The Team

We're the Site Reliability Engineering team within Bloomberg's Application Middleware group. Our mission: ensure that Bloomberg's core connectivity and messaging layers are resilient, scalable, and fully observable. We own systems that operate at high throughput and low latency, including:

Gateways: Secure, high-performance TCP/SSL entry points to our data centers
HFN & NSTP: A global HTTP CDN and SOCKS5 proxy network delivering fast access from any geography
Playlist Services: Dynamic path configuration systems optimizing user connectivity in real time
PGM Relays: Infrastructure for reliable multicast data delivery

We use automation, observability, and software engineering to detect issues before they impact customers and reduce manual toil wherever we can.