Software Developer - Research Technology
Job description
XTX is a research-driven organisation, built and led by passionate mathematicians and computer scientists. The Research Technology team lies at the heart of the company - the CPU and GPU clusters are understandably regarded as some of the firm's core assets, and driving their development forward is a primary focus of XTX. The research performed on the cluster is fundamental to the firm's success.
What we are doing:
- The cluster spans multiple data-centres and accommodates multiple tenants, providing CPU and GPU resources for executing tasks, running services, hosting LLMs, etc. We are actively expanding the features and UX that we provide to quants and other teams within XTX.
- We have built and open-sourced our own exa-scale filesystem, designed to handle billions of directories, a trillion files and a million clients spanning multiple data-centres, whilst offering complete resiliency. You can read more about it here. Current projects include providing full NFS access, optimising storage services, I/O feature expansion (e.g. greater POSIX compatibility, creating an iSCSI target), bandwidth optimisation, etc.
- We have several low-level technical projects including file system optimisation, storage performance, file compression, network segregation, as well as network, GPU and system performance management.
- Higher-level projects include service failover, system and configuration deployment, and hardware failure management.
The Research Technology team at XTX Markets is responsible for all aspects of the firm's HPC cluster as well as supporting the work of the quantitative researchers that use it. Although the team's scope encompasses all aspects of infrastructure and software design, implementation and maintenance, this role is primarily focused on software development.
Requirements
- Successful candidates will be self-motivated self-starters. They will constructively engage with the team of researchers and look for novel and scalable ways of solving problems, improving resiliency and enhancing the scalability of the system.
- They will have a strong awareness of risk - not afraid to promote radical change and alternative ways of thinking, but also able to deliver solutions in a pragmatic and secure manner reducing the potential for operational failure.
- They must be prepared to work in a fast-moving environment and manage the challenges of maintaining a complex live system 24/7 whilst delivering change at short notice or to tight deadlines. Time-to-market is key.
- A solid grounding in academic CS fundamentals (algorithms and data structures).
- Proficient in at least one statically typed language; development will be in Golang and Rust, though experience in these is not a prerequisite. Scripting is mainly in Python.
- Approximately 5-10 years' experience designing and building large-scale distributed systems, with the ability to develop highly scalable solutions to problems.
- Strong problem solving and analytical skills.
- Familiarity with the Linux operating system; able to engage in diagnosing issues, specifically those associated with performance and scalability.
- Ability to multi-task, working on multiple projects at once and prioritising appropriately across them.
- Be highly self-motivated and able to work independently without supervision.
- Understanding of one or more machine learning frameworks and compute offload devices, like GPUs, is an advantage.
Benefits & conditions
- Onsite gym, sauna, and fitness classes at no charge.
- Extensive medical benefits including an on-site doctor and therapist at no charge.
- Breakfast and lunch provided daily.
- Various supports for caregivers, including emergency dependent care.
- Beautiful Kings Cross office: https://vimeo.com/257888726
- 25 days paid holiday per year + statutory holiday and paid sick days. We currently operate 4 days a week in-office, 1 from home.