Sr DevOps Engineer

SPERIA DYNAMICS LLC
Atlanta, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Atlanta, United States of America

Tech stack

Amazon Web Services (AWS)
Automation of Tests
Azure
Bash
Cloud Computing
Cloud Computing Security
Configuration Management
Continuous Integration
DevOps
Fault Tolerance
Python
Linux System Administration
Regression Testing
Reliability Engineering
Systems Architecture
Scripting (Bash/Python/Go/Ruby)
System Availability
Delivery Pipeline
Grafana
MTTR
Reliability of Systems
Infrastructure as Code (IaC)
Build Management
Containerization
Kubernetes
Deployment Automation
Performance Monitor
Terraform
Data Pipelines
Docker

Job description

The Lead Engineer, DevOps plays an integral role in implementing and executing cloud practices for build management, product release, and operations processes. The role is responsible for managing and automating the build and deployment process and regression testing, and for building the tools and monitoring used in product implementations. The Lead Engineer, DevOps will help define and maintain the procedures and tools used to deliver releases in a repeatable and scalable manner. This role involves significant collaboration with stakeholders across IT and with partners.

Roles & Responsibilities

Design, implement, and maintain automated deployment and configuration management systems.
Develop and maintain Infrastructure as Code (IaC) scripts for provisioning infrastructure.
Continuously improve deployment processes to enhance efficiency, reliability, and scalability.
Implement and manage CI/CD pipelines to automate the software delivery lifecycle.
Ensure integration of automated testing into all pipeline stages.
Maintain and optimize build agents/runners to support development workflows.
Implement and maintain monitoring, alerting, and logging systems.
Collaborate with teams to define and track Service Level Objectives (SLOs) and Service Level Indicators (SLIs).
Contribute to system reliability through proactive monitoring and alerting.
Conduct blameless post-incident reviews.
Contribute to defining and managing error budgets.
Develop and maintain runbooks and operational procedures.
Collaborate with development teams to design systems for high availability, scalability, and performance.
Perform capacity planning and system performance analysis.
Identify and address system bottlenecks.
Collaborate with security teams to implement best practices.
Ensure systems comply with industry standards and security requirements.
Create and maintain documentation for infrastructure, deployment processes, and workflows.
Contribute to internal knowledge bases and operational documentation.

Requirements

4+ years of experience in DevOps, Site Reliability Engineering (SRE), or Platform Engineering roles.
Strong experience with cloud platforms (Azure or AWS).
Experience with Linux environments.

Hands-on experience with:
Infrastructure as Code (Terraform or equivalent)
CI/CD tools and pipeline development

Experience designing and supporting highly available, fault-tolerant systems.
Proficiency in Python, Bash, or similar scripting languages.

Working knowledge of:
System architecture patterns
Observability tools
Cloud security best practices

Experience with containerization (Docker) and orchestration (Kubernetes or similar).
Demonstrated ability to learn and adopt new tools and technologies.

Key Performance Indicators (KPIs)

CI/CD pipeline success rate and execution time
Deployment frequency and failure rate
System uptime and reliability metrics (SLO adherence)
Mean time to resolution (MTTR)
Build agent availability and performance
Automation coverage across deployment workflows

Reliable, scalable, and efficient deployment pipelines
Reduced manual intervention in release processes
High system availability and performance
Strong alignment between engineering and operations
Well-documented and repeatable operational processes

About the company

In a world with an ever-growing population, climate and sustainability challenges, and changing consumer demands, the global food industry is going through a profound transformation. Did you know that 815 million people go to bed hungry every night? At the same time, 1/3 of all food produced globally is wasted. The Speria business is all about changing this: not only to improve animal welfare and drive yield and sustainability for actors in the global food supply chain, but also to help feed the world by enabling change in the way we farm and produce food. Our contribution is to create a digital ecosystem, including data capture platforms such as connected controllers, IoT devices, and sensors, that together with predictive AI and real-time monitoring allows farmers and growers to improve animal welfare and maximize production while minimizing waste and CO2 emissions, ensuring transparent and sustainable food production for a growing population. Speria: spearheading digitalization by providing innovative solutions that enable the green transition.
