FedNow Senior Site Reliability Engineer
Job description
-
As a Senior Engineer on the SRE / Production Operations team for FedNow, you will operate the production environment for the program.
-
You will architect, implement, and use monitoring and tooling for capacity planning, utilization reporting, and scaling.
-
The team uses open source and proprietary software to support Engineering, DevOps, and DevSecOps tools, services, and solutions.
-
CI/CD and IaC pipeline automation design and development.
-
Resiliency, disaster recovery (DR), and business continuity planning (BCP), including testing.
-
The SRE / Production Operations team is part of the Technical Operations (TechOps) department and has the overall responsibility for the design, management and execution of operations required to support the ongoing technical and delivery needs of the FedNow Program, as well as the transition to production support and operations.
-
This team interfaces with internal stakeholders and customers for planning, delivery, and service management.
-
It owns ongoing ITIL processes and drives the implementation of continuous improvement initiatives.
-
You will work closely with Engineers and Architects of the FedNow program in order to maintain seamless automation across the entire platform.
-
Proactively identify suspected gaps in system architecture, and design experiments to expose them.
-
The ideal candidate is someone who loves building and maintaining reliable and scalable systems and CI/CD tooling, and automating highly available, high-performing cloud-based applications.
All employees assigned to this position will be subject to FBI fingerprint/criminal background and Patriot Act/Office of Foreign Assets Control (OFAC) watch list checks at least once every five years.
For this job, any offer of employment is contingent upon successfully passing a two-phase security screening. The first phase consists of the satisfactory completion of a physical examination (including a drug screening), reference checks, and a security investigation consisting of credit and criminal history checks.
The second phase, which might not be complete until after you begin working at the Reserve Bank, is an additional risk-based security screening determined by the risk rating of the position. Depending upon the sensitivity of the position, this phase may include, but is not limited to, work and residency eligibility verification and personal interviews with the candidate, references, and prior employers.
All applicants must have resided in the United States for at least three (3) years.
Requirements
-
Strong communication and collaboration skills
-
Extensive knowledge of AWS environments and services (EC2, EBS, EKS, RDS, Aurora, S3, Route 53, ELB, IAM, etc.)
-
HashiCorp Terraform, Consul, and Vault, as well as Ansible
-
Automation experience, preferably using GitLab
-
Experience with scripting languages, preferably Python, for automated processes
-
Experience working in Linux environments and with shell scripting
-
Experience supporting infrastructure for large multi-service applications
-
Experience working with continuous deployment in microservices architectures
-
Experience working with Docker, containers, ECR, and EKS
-
Observability - CloudWatch, OpenSearch, Dynatrace, Grafana, Prometheus
-
Familiarity with fault injection tooling (e.g., AWS Fault Injection Simulator, Gremlin, Chaos Toolkit, Chaos Monkey)
-
Automation mindset to enable consistency and dependability in common actions