Senior Site Reliability Engineer
NICE
Southampton, United Kingdom
4 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Remote; Southampton, United Kingdom
Tech stack
Java
Amazon Web Services (AWS)
Software Applications
Systems Engineering
Bash
C Sharp (Programming Language)
Continuous Integration
DevOps
Distributed Systems
Amazon DynamoDB
Monitoring of Systems
Python
Performance Tuning
Powershell
Reliability Engineering
Ansible
Prometheus
Software Engineering
Software Systems
Datadog
CircleCI
Scripting (Bash/Python/Go/Ruby)
Google Cloud Platform
Cloud Platform System
Grafana
Cloudformation
Containerization
Gitlab-ci
Kubernetes
Infrastructure Automation Frameworks
Cloudwatch
Puppet
Rundeck
Terraform
Splunk
Docker
Pagerduty
ELK
Jenkins
Go
Microservices
Job description
At NiCE, we don't limit our challenges. We challenge our limits. Always. We're ambitious. We're game changers. And we play to win. We set the highest standards and execute beyond them. And if you're like us, we can offer you the ultimate career opportunity that will light a fire within you.
So, what's the role all about?
- Run the production environment by monitoring availability and taking a holistic view of system health
- Build software and systems to manage platform infrastructure and applications
- Improve reliability, quality, and time-to-market of our suite of software solutions
- Measure and optimize system performance, with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
- Provide primary operational support and engineering for multiple large distributed software applications
How will you make an impact?
- Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
- Partner with development teams to improve services through rigorous testing and release procedures
- Participate in system design consulting, platform management, and capacity planning
- Create sustainable systems and services through automation and uplifts
- Balance feature development speed and reliability with well-defined service level objectives
At NICE, we work according to the NICE-FLEX hybrid model, which enables maximum flexibility: 2 days working from the office and 3 days of remote work each week. Naturally, office days focus on face-to-face meetings, where teamwork and collaborative thinking generate innovation, new ideas, and a vibrant, interactive atmosphere.
Requirements
- 3-6 years of working experience in a similar role, with a focus on systems engineering, automation, and reliability.
- Proficiency in at least one programming language (e.g., Python, Go, Java, C#) and experience with scripting languages (e.g., Bash, PowerShell).
- Deep understanding of cloud computing platforms (e.g., AWS) and of the working and reliability constraints of prominent services (e.g., EC2, ECS, Lambda, DynamoDB).
- Experience with infrastructure as code tools such as CloudFormation or Terraform.
- Deep understanding of CI/CD concepts and experience with CI/CD tools such as Jenkins, GitLab CI/CD, or CircleCI.
- Strong knowledge of containerization technologies (e.g., Docker, Kubernetes) and microservices architecture.
- Experience with monitoring and observability tools (e.g., Prometheus, Grafana, ELK stack, CloudWatch).
- Excellent problem-solving skills and the ability to troubleshoot complex issues in distributed systems.
- Experience with incident management and blameless postmortems, including driving incident response, resolution, and communication during outages and other critical incidents in a cross-functional team setup.
You will have an advantage if you also have:
- Hands-on experience working with large Kubernetes clusters. Certification is an added plus.
- Working experience with the Grafana Observability Suite (Loki, Mimir, Tempo).
- Administration and/or development experience with standard monitoring and automation tools such as Splunk, Datadog, PagerDuty, and Rundeck.
- Familiarity with configuration management tools like Ansible, Puppet, or Chef.
- Certifications such as AWS Certified DevOps Engineer, Google Cloud Professional DevOps Engineer, or equivalent.
- Strong communication skills and the ability to collaborate effectively with cross-functional teams.
- Team player - ability to work well in a close team environment.
- Fast learner with the ability to self-educate on relevant technologies
- Ability to multitask and prioritize work
- Ability to remain focused and calm under pressure
About the company
NICE Ltd. (NASDAQ: NICE) software products are used by 25,000+ global businesses, including 85 of the Fortune 100 corporations, to deliver extraordinary customer experiences, fight financial crime and ensure public safety. Every day, NiCE software manages more than 120 million customer interactions and monitors 3+ billion financial transactions.
Known as an innovation powerhouse that excels in AI, cloud and digital, NiCE is consistently recognized as the market leader in its domains, with over 8,500 employees across 30+ countries.