Senior Software Engineer/SRE - Application Middleware
Bloomberg's Application Middleware Group
Charing Cross, United Kingdom
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Charing Cross, United Kingdom
Tech stack
C++
Configuration Management
Data Structures
Database Theory
Distributed Systems
Middleware
Fault Tolerance
Systems Analysis
Python
Multicasting
Network Protocols
Software Engineering
Transmission Control Protocol (TCP)
Reliability of Systems
Kubernetes
Information Technology
Operational Systems
Job description
- Design and implement scalable, fault-tolerant systems with a focus on observability, performance, and automation
- Collaborate across engineering teams to introduce automated, self-service operational workflows
- Conduct deep systems analysis and root cause investigations for complex, distributed systems
- Propose and prototype innovative approaches to reliability and risk mitigation
- Contribute to design docs, runbooks, and post-incident reviews; clear communication is part of the job
Requirements
- A degree in Computer Science, Engineering, Mathematics, or equivalent practical experience
- Strong software engineering skills in any high-level language (we mainly use Python and C++)
- A deep understanding of software system reliability and risk management, including how to identify potential points of failure and design mitigation strategies
- A good understanding of data structures, algorithms, and system design
- Experience navigating and improving large, distributed codebases
- An ability to identify system risks and engineer around points of failure
- Clear written and verbal communication, including technical documentation and incident analysis
We'd Love to See
We are building a team with a breadth of expertise and value depth in any of the following areas:
- Systems Knowledge: A strong grasp of operating systems, fundamental networking protocols (TCP, UDP, multicast), or core database concepts as they apply to modern infrastructure.
- Cluster Management: Experience with deployments, staging, and configuration management. Direct experience with Argo, Kubernetes, or other pipeline management platforms is a significant advantage.
- Machine Management at Scale: Experience with capacity planning and automating the lifecycle of large machine fleets.
- System Observability and Monitoring: Deep understanding of SLIs/SLOs/SLAs, alerting, and building dashboards for complex systems.
- Reliability in Distributed Systems: Knowledge of fault tolerance and the unique challenges of network and node failure in distributed environments.
- Mentoring: Proven experience mentoring and growing junior engineers.
About the company
Are you passionate about building high-performance systems that are fast, resilient, and operate at global scale? Join Bloomberg's Application Middleware SRE team, where you'll combine software engineering and systems expertise to keep the backbone of the Bloomberg Terminal running smoothly for hundreds of thousands of users around the world.
We're not your typical SRE team. We're embedded in a group that powers real-time connectivity, and we own systems where uptime isn't just important, it's essential to the global financial system. This is your opportunity to engineer resilience at scale, automate critical infrastructure, and shape reliability practices across one of the world's most powerful tech platforms.
The Team
We're the Site Reliability Engineering team within Bloomberg's Application Middleware group. Our mission: ensure that Bloomberg's core connectivity and messaging layers are resilient, scalable, and fully observable.
We own systems that operate at high throughput and low latency, including:
* Gateways: Secure, high-performance TCP/SSL entry points to our data centers
* HFN & NSTP: A global HTTP CDN and SOCKS5 proxy network delivering fast access from any geography
* Playlist Services: Dynamic path configuration systems optimizing user connectivity in real-time
* PGM Relays: Infrastructure for reliable multicast data delivery
We use automation, observability, and software engineering to detect issues before they impact customers and reduce manual toil wherever we can.