Senior Site Reliability Engineer
Job description
We're looking for a Senior Site Reliability Engineer who's passionate about designing, maintaining, and scaling infrastructure that powers billions of ad impressions and millions of requests per second. You'll play a key role in ensuring our systems are reliable, resilient, and ready for the future.
Your responsibilities will include:
- Orchestrate our Cloud: Manage and enhance a large-scale AWS infrastructure spanning thousands of instances across 6 regions.
- Champion Performance: Monitor, analyze, and optimize system performance to ensure our platforms handle over 3 million requests per second seamlessly.
- Drive Reliability: Respond swiftly to incidents, resolve complex issues, and contribute to postmortems to continuously improve system resilience.
- Scale: Design and refine our scaling procedures to support growth while ensuring stability and efficiency.
- Automate Everything: Build automation tools and Infrastructure-as-Code solutions to minimize manual effort and maximize reliability.
- Elevate Quality: Uphold strong standards for system configurations and code, ensuring a robust and secure infrastructure.
- Collaborate for Impact: Partner with Backend Engineers, Data Engineers, and Product Managers to embed reliability and performance into every layer of our stack.
- Champion CI/CD: Promote best practices in continuous integration and deployment for consistent, rapid, and reliable delivery.
Requirements
- Have deep experience with AWS ecosystems, and know how to manage large-scale, multi-region infrastructures with confidence and precision;
- Are skilled at monitoring, logging, and alerting, ensuring system health and fast incident resolution;
- Can write and maintain clean, reliable, and testable code in at least two programming languages, ideally with a focus on automation and Infrastructure as Code;
- Have a strong grasp of scalability challenges and a proven ability to design and operate systems that perform flawlessly under high traffic;
- Are comfortable with low-level Linux functionalities and understand OS theory of operations to optimize system performance;
- Have experience maintaining or operating production systems and take ownership of reliability and uptime;
- Know your way around at least one public cloud platform (AWS, Azure, or GCP);
- Communicate clearly, thrive in collaborative environments, and can articulate your technical decisions with confidence;
- Remain calm and analytical when troubleshooting complex issues, especially under pressure;
- Care deeply about both technical excellence and business impact, ensuring every solution drives real value for Adikteev and its customers;
- Are fluent in English and comfortable working in a global team.
And if you bring any of the following, that's even better:
- Experience with large-scale distributed systems or service-oriented architectures;
- Familiarity with caching systems (in-memory, distributed);
- Hands-on experience with high-performance or low-latency systems;
- Understanding of stream processing architectures;
- Experience with Kubernetes, Prometheus, or ArgoCD;
- A natural tendency to design for scalability, automate wherever possible, and continuously improve reliability.
Please note that, to comply with employment regulations, applicants are required to maintain residency and be legally authorized to work in France or the European Union in order to be considered for this position.
Benefits & conditions
- Competitive salary;
- Performance-based quarterly bonus with transparent KPIs;
- Longevity bonus every 2 years;
- 4.5-day work week (Friday afternoons off);
- "RTT";
- Mental health support;
- Flexible remote working policy;
- Regular team events and activities;
- Inclusive parental leave policy.
Most benefits are available to all employees. For remote team members based outside of France, benefits will be aligned with local laws and practices in your country of residence, ensuring support that's both relevant and compliant wherever you are.