Site Reliability Engineer (SRE)

BCforward

Pennington, United States of America

yesterday

Role details

Contract type

Temporary contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$153K

Job location

Pennington, United States of America

Tech stack

Proxy Servers

Application Services

Big Data

Continuous Delivery

Continuous Integration

Quartz (Graphics Layer)

Linux

DNS

Elasticsearch

Perl

Design of User Interfaces

Python

Reliability Engineering

Site Reliability Engineering Practices

Logstash

Ansible

Shell Script

Software Engineering

Load Balancing

Firewalls (Computer Science)

Git

Kibana

Terraform

Splunk

Dynatrace

Jenkins

Job description

We are seeking a Site Reliability Engineer (SRE) to join our dynamic team supporting the MAPS Quartz platform. The ideal candidate will have strong experience in observability, automation, CI/CD, Linux, networking, and incident response, and a proven ability to improve reliability, reduce toil, and enhance operational efficiency at scale.

Design, develop, test, and implement secure, robust, highly available, and scalable solutions for Global Markets applications and platforms.
Build and maintain automated CI/CD pipelines and deployment approaches with Git and Jenkins.
Own reliability across services, lead incident response, and drive issues to permanent resolution.
Define and utilize SRE practices, SLIs, and SLOs to detect and resolve issues proactively.
Create dashboards, visualizations, and reports from large datasets to inform continuous improvement.
Eliminate toil and automate triage to improve operational stability and efficiency.
Collaborate with global teams to identify, analyze, and remediate platform vulnerabilities.
Promote adoption of site reliability engineering best practices across teams and stakeholders.

Requirements

5+ years of experience in SRE, software development, infrastructure engineering, or a related field.
Proven experience operating, monitoring, and maintaining scalable and resilient application services and platforms.
Hands-on with observability and monitoring: OpenTelemetry, ELK (Elasticsearch, Logstash, Kibana), Splunk, and Dynatrace.
Proficiency in Python and Shell scripting; knowledge of Perl is a plus.
Experience implementing CI/CD with Git and Jenkins.
Advanced networking knowledge including firewalls, DNS, load balancing, and proxies.
Advanced Linux expertise, including shell usage and automation with core tools.
Ansible proficiency, including writing playbooks and using core modules.
Excellent interpersonal, organizational, and communication skills.

Preferred Skills:

UI/UX experience to guide best practices for internal tooling used by production support teams.
Infrastructure as Code with Terraform for automated infrastructure deployment.
Experience in large enterprise environments.
Ability to diagnose and resolve complex infrastructure issues with speed and accuracy.
Ability to work in a fast-paced environment and meet deadlines.

Benefits & conditions

Competitive compensation and benefits
Opportunities for growth with global clients
A supportive, inclusive culture that values innovation and people
Exposure to cutting-edge technologies and projects

About the company

BCforward is a leading global IT consulting and workforce solutions firm providing services and support to Fortune 500 and government clients. Founded in 1998, BCforward has grown with our customers' needs into a full-service business solutions provider. With delivery centers and offices across North America and India, we take pride in building long-term relationships and delivering excellence through innovation, collaboration, and integrity.