Site Reliability Engineer (SRE)
BCforward
Pennington, United States of America
yesterday
Role details
Contract type: Temporary contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $153K
Job location: Pennington, United States of America
Tech stack
Proxy Servers
Application Services
Big Data
Continuous Delivery
Continuous Integration
Quartz (Graphics Layer)
Linux
DNS
Elasticsearch
Perl
Design of User Interfaces
Python
Reliability Engineering
Site Reliability Engineering Practices
Logstash
Ansible
Shell Script
Software Engineering
Load Balancing
Firewalls (Computer Science)
GIT
Kibana
Terraform
Splunk
Dynatrace
Jenkins
Job description
We are seeking a Site Reliability Engineer (SRE) to join our dynamic team supporting the MAPS Quartz platform. The ideal candidate will have strong experience in observability, automation, CI/CD, Linux, networking, and incident response and a proven ability to improve reliability, reduce toil, and enhance operational efficiency at scale.
- Design, develop, test, and implement secure, robust, highly available, and scalable solutions for Global Markets applications and platforms.
- Build and maintain automated CI/CD pipelines and deployment approaches with Git and Jenkins.
- Own reliability across services, lead incident response, and drive issues to permanent resolution.
- Define and utilize SRE practices, SLIs, and SLOs to detect and resolve issues proactively.
- Create dashboards, visualizations, and reports from large datasets to inform continuous improvement.
- Eliminate toil and automate triage to improve operational stability and efficiency.
- Collaborate with global teams to identify, analyze, and remediate platform vulnerabilities.
- Promote adoption of site reliability engineering best practices across teams and stakeholders.
Requirements
- 5+ years of experience in SRE, software development, infrastructure engineering, or a related field.
- Proven experience operating, monitoring, and maintaining scalable and resilient application services and platforms.
- Hands-on with observability and monitoring: OpenTelemetry, ELK (Elasticsearch, Logstash, Kibana), Splunk, and Dynatrace.
- Proficiency in Python and Shell scripting; knowledge of Perl is a plus.
- Experience implementing CI/CD with Git and Jenkins.
- Advanced networking knowledge including firewalls, DNS, load balancing, and proxies.
- Advanced Linux expertise, including shell usage and automation with core tools.
- Ansible proficiency, including writing playbooks and using core modules.
- Excellent interpersonal, organizational, and communication skills.
Preferred skills
- UI/UX experience to guide best practices for internal tooling used by production support teams.
- Infrastructure as Code with Terraform for automated infrastructure deployment.
- Experience in large enterprise environments.
- Ability to diagnose and resolve complex infrastructure issues with speed and accuracy.
- Ability to work in a fast-paced environment and meet deadlines.
Benefits & conditions
- Competitive compensation and benefits
- Opportunities for growth with global clients
- A supportive, inclusive culture that values innovation and people
- Exposure to cutting-edge technologies and projects
About the company
BCforward is a leading global IT consulting and workforce solutions firm providing services and support to Fortune 500 and government clients. Founded in 1998, BCforward has grown with our customers' needs into a full-service business solutions provider. With delivery centers and offices across North America and India, we take pride in building long-term relationships and delivering excellence through innovation, collaboration, and integrity.