Site Reliability Engineer, Senior - Shared Services

Toyota Financial Services
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Tech stack

Microsoft Windows
Amazon Web Services (AWS)
Bash
Unix
Cloud Computing
Cloud Engineering
Continuous Delivery
Continuous Integration
Linux
DevOps
Distributed Systems
Middleware
GitHub
Monitoring of Systems
Identity and Access Management
WildFly (JBoss AS)
Python
Network Troubleshooting
Network Architecture
PowerShell
Cloud Services
Web Services
Scripting (Bash/Python/Go/Ruby)
Cloud Platform System
DevOps Tools - Open-source
AWS Lambda
CloudFormation
Kubernetes
Infrastructure Automation Frameworks
Information Technology
CloudWatch
Terraform
Software Version Control
Data Pipelines
Dynatrace
Serverless Computing
Jenkins
MuleSoft
Artifactory

Job description

Reporting to the Manager of SRE Departments, the individual in this role will focus on operating and automating scalable, resilient shared services and platforms hosted on AWS infrastructure. You will work with core AWS services such as EKS, Lambda, Cloud WAN, ECR, and Systems Manager, driving self-healing automation, observability, and CI/CD pipeline integration.

This role applies SRE best practices to ensure the reliability, performance, and operational excellence of the cloud-native platforms that support our business-critical applications. The engineer will also work closely with the Platform Development teams, Production Engineering, and the Major Incident Management team to address and resolve issues in the production environment.

What you'll be doing

  • Build and maintain the automation and self-healing components for services running on AWS infrastructure, with a focus on middleware, JBoss, web services, AWS IAM, Lambda functions, EKS (Kubernetes), ECR, and AWS Systems Manager (SSM).
  • Develop automations and maintain infrastructure as code (IaC) using Terraform, ensuring scalable and repeatable deployments.
  • Manage and support DevOps tools, including GitHub, JFrog Artifactory, Harness, and Jenkins.
  • Manage container orchestration platforms and related cloud-native services.
  • Create and maintain AWS Systems Manager Automation Documents (SSM Documents) for operational workflows.
  • Define and measure SLIs/SLOs, error budgets, and drive reliability improvements.
  • Implement monitoring and observability using Dynatrace, integrated with AWS-native observability services such as CloudWatch.
  • Implement and maintain CI/CD pipelines and deployments with GitHub and Harness.
  • Participate in incident management, on-call rotations, and lead blameless postmortems.
  • Collaborate cross-functionally to embed SRE principles into cloud platform design and operation.
  • Perform root cause analysis (RCA) and implement fixes that resolve and prevent problems in components running on cloud infrastructure.
  • Continuously seek opportunities to improve cloud infrastructure.
  • Collaborate on designing scalable architectures with Cloud Development teams and implementing CI/CD pipelines.
  • Manage cloud routing and troubleshoot network issues.

Requirements

The Toyota Financial Services Technology Operations Center is looking for a passionate and highly motivated SRE - Shared Services.

  • Bachelor's degree in information technology or a related field.

  • 1+ years of experience with, and a solid understanding of, SRE concepts: SLIs, SLOs, error budgets, incident response.

  • 6+ years of experience in SRE, Middleware Support, AWS Cloud Platform, DevOps.

  • Strong hands-on experience with cloud-native services and platforms such as web services, JBoss, EKS, Lambda, ECR, EC2, and S3.

  • Strong understanding of network architecture and protocols within AWS.

  • 1+ years of experience building self-healing systems and automated remediation workflows.

  • 2+ years of experience with DevOps tooling: GitHub (version control), Harness (CI/CD), Dynatrace (observability).

  • 2+ years of experience with infrastructure-as-code tools like CloudFormation, Terraform, and Python modules.

  • 2+ years of experience troubleshooting infrastructure services on Unix, Linux, and Windows.

  • 2+ years of experience with IT service management (ITSM) processes.

  • 2+ years of scripting experience (e.g., Python, Bash, or PowerShell).

  • 2+ years of demonstrated troubleshooting and problem-solving experience.

  • Ability to work independently and as part of a team.

  • 1+ years of experience with Dynatrace, including:
      • Setting up dashboards, alerts, and SLOs
      • Creating custom metrics and integrations
      • Monitoring distributed systems and data pipelines

Added bonus if you have

  • Certifications such as AWS Certified DevOps Engineer or AWS Certified Solutions Architect.
  • Knowledge of integration tools and technologies such as MuleSoft, Apache Camel, and message streaming services.

Benefits & conditions

During your interview process, our team can fill you in on all the details of our industry-leading benefits and career development opportunities. A few highlights include:

  • A work environment built on teamwork, flexibility and respect
  • Professional growth and development programs to help advance your career, as well as tuition reimbursement
  • Vehicle purchase & lease programs
  • Comprehensive health care and wellness plans for your entire family
  • Toyota 401(k) Savings Plan featuring a company match, as well as an annual retirement contribution from Toyota regardless of whether you contribute

Belonging at Toyota

Our success begins and ends with our people. We embrace all perspectives and value unique human experiences. Respect for all is our North Star. Toyota is proud to have 10+ different Business Partnering Groups across 100 different North American chapter locations that support team members' efforts to dream, do and grow without questioning that they belong.

About the company

Collaborative. Respectful. A place to dream and do. These are just a few words that describe what life is like at Toyota. As one of the world's most admired brands, Toyota is growing and leading the future of mobility through innovative, high-quality solutions designed to enhance lives and delight those we serve. We're looking for talented team members who want to Dream. Do. Grow. with us. An important part of the Toyota family is Toyota Financial Services (TFS), the finance and insurance brand for Toyota and Lexus in North America. While TFS is a separate business entity, it is an essential part of this world-changing company, delivering on Toyota's vision to move people beyond what's possible. At TFS, you will help create a best-in-class customer experience in an innovative, collaborative environment.
