Site Reliability Engineer
Job description
Our Site Reliability Engineering (SRE) team is at the center of strategic firm initiatives and contributes to defining and building the observability and stability patterns supporting product roadmaps. The role is expected to establish and follow architectural best practices and build scale and efficiencies that enable future business growth.
What You'll Do
In this role, you will help with the design, observability, and stability of new technology platforms supporting Edward Jones's Digital Branch Experience and partner closely with application, platform, and security teams. Day to day, you will apply software engineering principles to infrastructure and operations, focusing on automation, observability, resilience, and continuous improvement across cloud and on-premises environments.
- Design observability, monitoring, and alerting solutions at the firm level
- Focus on continuous improvement to drive a seamless, modernized associate/user experience
- Participate in analysis, design, development, testing, and implementation activities within the areas of responsibility
- Assist, mentor, and train other team members on observability, alerting, and logging practices
- Drive adoption of architectural patterns, design policies, observability, and performance practices across initiatives
- Define and manage service level indicators (SLIs), service level objectives (SLOs), and error budgets to balance reliability and innovation
- Automate infrastructure provisioning and operations using Infrastructure as Code (IaC)
- Lead and participate in incident response, root cause analysis (RCA), and post-incident reviews
- Improve reliability through resilience testing, capacity planning, and performance tuning
- Reduce operational toil through automation and self-healing solutions
- Contribute to runbooks, documentation, and operational standards
Internally, the position title is Technical Architect.
Requirements
- Bachelor's degree in Computer Science, Engineering, or equivalent practical experience
- 7+ years of experience in SRE, DevOps, infrastructure, or production operations roles
- Experience with monitoring and observability tools such as Dynatrace, Prometheus, Grafana, CloudWatch, Azure Monitor, Datadog, or New Relic
- Hands-on experience with cloud platforms
- Experience with containers and orchestration (Docker; Kubernetes on EKS, AKS, or GKE)
- Production experience managing Linux-based systems
- Proficiency in at least one programming or scripting language (Python, Go, Java, Bash, or similar)
- Experience with Infrastructure as Code tools (Terraform, CloudFormation, ARM/Bicep)
- Availability to participate in an on-call rotation schedule
What Could Set You Apart
- Advanced understanding of Agile frameworks and scaled delivery models (e.g., Scrum, Kanban, SAFe), with the ability to influence stakeholders, optimize team execution, and apply Agile principles at the program or portfolio level.
- Proven experience leading or operating within mature Agile/Scrum delivery environments, with accountability for delivery outcomes, cross-functional collaboration, and continuous improvement.
Benefits & conditions
At Edward Jones, we are building a place where everyone feels like they belong. We're proud of our associates' contributions to the firm and the recognitions we have received.
Check out our U.S. awards and accolades: Insights & Information Blog Postings about Edward Jones
Check out our Canadian awards and accolades: Insights & Information Blog Postings about Edward Jones