Sr. DevOps Engineer - AI and Site Reliability Engineering

Teradata
Topeka, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Topeka, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Cloud Computing
DevOps
Network Layer
Machine Learning
Reliability Engineering
Cloud Services
Software Deployment
Software Systems
Teradata
Datadog
Google Cloud Platform
Large Language Models
Multi-Agent Systems
Reliability of Systems
Information Technology

Job description

  1. Working on a team of professionals, you will design, implement, test, deploy, administer, and continually improve software solutions to ensure system reliability and availability, mitigate operational risks, track system health, and improve mean-time-to-discover and mean-time-to-respond for operational issues.

  2. You will help lead chaos engineering efforts in a production-like environment, exposing systems to simulations of real-world turbulence with the objective of identifying and quantifying operational weaknesses and developing remediation strategies.

  3. You will leverage modern AI technologies, including large language models, machine learning, and agentic systems, both to increase the operational efficiency of the team and to measure and improve the reliability, scalability, observability, supportability, and performance of Teradata software.

  4. You will become a subject-matter expert in the production deployment and upgrade of Teradata software and of the full software stack it relies on, from the network layer to the observability tooling.

Who You'll Work With

  1. You'll work on a globally distributed team of DevOps professionals, with engineers focused on site reliability engineering and observability.

  2. You'll work closely with product engineering and cloud operations personnel to understand operational requirements and identify and remediate operational deficits.

  3. You'll work with security and compliance teams to help provide the evidence necessary to meet Teradata's compliance obligations.

  4. You'll report to a Sr. Manager, Site Reliability Engineering.

Requirements

  1. Bachelor's degree or equivalent in computer science or a related field; master's degree or equivalent preferred.

  2. 4+ years of industry experience.

  3. Experience with at least one major cloud service provider (AWS, Azure, and/or Google Cloud), preferably all three. CSP developer or architect certificatio

About the company

At Teradata, we believe that people thrive when empowered with better information. Teradata Autonomous Knowledge Platform activates enterprise intelligence by unifying data, knowledge and business context to achieve tangible outcomes. With Teradata, organizations can provide agents with full context for impact when it matters. Our solution lets businesses connect and scale on premises, in the cloud, or through a hybrid approach. Teradata delivers real business value with AI.
