SITE RELIABILITY ENGINEER
THE JUDGE GROUP, INC.
Berkeley Heights, United States of America
4 days ago
Role details
Contract type: Temporary to permanent
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Intermediate
Compensation: $162K
Job location: Berkeley Heights, United States of America
Tech stack
Java
Monitoring of Systems
Python
Cisco Nexus Switches
Powershell
Reliability Engineering
Ansible
SonarQube
Grafana
Gitlab
Information Technology
Terraform
Splunk
Dynatrace
Job description
- Design, develop, and implement automation solutions to streamline operational tasks and system health checks.
- Monitor and maintain production systems using observability platforms such as Dynatrace, Splunk, or similar tools.
- Analyze system performance and reliability metrics to identify gaps and drive process improvements.
- Partner with engineering and operations teams to provide system design guidance, platform support, and capacity planning.
- Develop and maintain comprehensive documentation, including standard operating procedures (SOPs), configuration details, and infrastructure diagrams.
Requirements
- Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience.
- 5+ years of experience in Site Reliability Engineering, preferably within a product-based or financial technology environment.
- 4+ years of experience in automation and scripting using languages and tools such as Python, Java, Ansible, or PowerShell.
- 4+ years of experience with monitoring and observability tools such as Dynatrace, Splunk, Grafana, or similar platforms.
Preferred Qualifications
- Experience with CI/CD pipelines and tools such as GitLab, Harness, Terraform, Nexus, or SonarQube.
- Strong analytical and problem-solving skills, with the ability to perform root cause analysis and implement proactive solutions.
- Excellent communication and collaboration skills, with the ability to work effectively with cross-functional teams.