DevOps Engineer

Ingram Marine Group

Nashville, United States of America

9 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Shift work

Languages

English

Experience level

Senior

Job location

Nashville, United States of America

Tech stack

Application Performance Management

Azure

System Configuration

Continuous Integration

Data Visualization

DevOps

Disaster Recovery

Monitoring of Systems

Nagios

Site Reliability Engineering Practices

Ansible

Prometheus

Message Oriented Middleware

Software Engineering

Cloud Platform System

Delivery Pipeline

Grafana

Reliability of Systems

GitLab

Angular

GitLab CI

Kubernetes

Infrastructure Automation Frameworks

Deployment Automation

Front End Software Development

3-tier Architectures

Terraform

Job description

Ingram Marine Group is seeking a DevOps Engineer to join our dynamic DevSecOps Team in the Nashville, TN area. This person will work alongside our Systems Architect, Application Development Architect, and Security Engineer and focus on operationalizing our cloud-native infrastructure, enhancing CI/CD pipelines, ensuring system reliability and resilience, and providing 24x7 operational support.

What you will be doing:

Pipeline & Automation

Designing and implementing advanced CI/CD pipeline features using GitLab
Developing and maintaining Terraform modules for infrastructure provisioning
Creating and optimizing Ansible playbooks for configuration management and deployment automation
Integrating security scanning and compliance checks into deployment pipelines

Container & Kubernetes Operations

Building, configuring, and maintaining Azure Kubernetes Service (AKS) clusters
Developing and optimizing Helm charts for application deployments
Implementing and managing GitOps workflows
Monitoring and troubleshooting containerized applications and cluster performance

Infrastructure & Reliability

Implementing Infrastructure as Code best practices using Terraform and Ansible
Designing and executing disaster recovery procedures and business continuity plans
Performing system patching, upgrades, and maintenance activities
Establishing and maintaining comprehensive monitoring, alerting, and observability solutions using Prometheus and Grafana

Cost Optimization & Resource Management

Monitoring and analyzing Azure cloud spending patterns and resource utilization
Implementing cost optimization strategies including right-sizing, reserved instances, and auto-scaling policies
Developing dashboards and reports for cost tracking and forecasting
Collaborating with teams to optimize resource allocation and eliminate waste

Monitoring & Observability

Designing and implementing comprehensive monitoring solutions using Prometheus for metrics collection
Building and maintaining Grafana dashboards for infrastructure, application, and business metrics
Configuring intelligent alerting rules and escalation procedures
Establishing SLIs, SLOs, and error budgets for critical services

24x7 Support & Incident Response

Participating in on-call rotation for 24x7 production support
Leading Tier 3 incident response efforts for production outages and system issues
Performing root cause analysis and implementing preventive measures
Collaborating with development teams on performance optimization and troubleshooting
Maintaining runbooks and documentation for operational procedures

Requirements

Knowledge, Skills, and Abilities:

Technical Expertise (5+ years)

Strong experience with Kubernetes (AKS preferred) and container orchestration
Proficiency in Infrastructure as Code: Terraform and Ansible
Advanced GitLab CI/CD pipeline development and optimization
Experience with GitOps methodologies and leading toolsets like Helm, Flux and/or ArgoCD
Python scripting for automation and pipeline tasks
Azure cloud services and networking concepts

Monitoring & Cost Management

Hands-on experience with Prometheus for metrics collection and alerting
Proficiency in Grafana for dashboard creation and data visualization
Experience with Azure Cost Management tools and FinOps practices
Knowledge of resource optimization techniques and auto-scaling strategies
Understanding of cloud pricing models and cost allocation methods

DevOps & SRE Practices

Incident management and post-mortem processes
24x7 on-call experience with escalation procedures
Disaster recovery planning and implementation
Security best practices in CI/CD and infrastructure
Experience with chaos engineering and resilience testing

Collaborative Skills

Experience working with cross-functional teams
Strong troubleshooting and problem-solving abilities under pressure
Documentation and knowledge sharing practices

Azure certifications (AZ-104, AZ-400, or AKS-related)
Experience with message bus systems (Azure Service Bus)
Knowledge of .NET applications and Angular frontend deployments
Familiarity with secret management solutions (Delinea or similar)
Experience with additional monitoring tools (Azure Monitor, Application Insights)
FinOps certification or cost optimization experience
Experience with alerting tools and PagerDuty integration
