DevOps Engineer

Ingram Marine Group
Nashville, United States of America
9 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Shift work
Languages
English
Experience level
Senior

Job location

Nashville, United States of America

Tech stack

Application Performance Management
Azure
System Configuration
Continuous Integration
Data Visualization
DevOps
Disaster Recovery
Monitoring of Systems
Nagios
Site Reliability Engineering Practices
Ansible
Prometheus
Message Oriented Middleware
Software Engineering
Cloud Platform System
Delivery Pipeline
Grafana
Reliability of Systems
Gitlab
Angular
Gitlab-ci
Kubernetes
Infrastructure Automation Frameworks
Deployment Automation
Front End Software Development
3-tier Architectures
Terraform

Job description

Ingram Marine Group is seeking a DevOps Engineer to join our dynamic DevSecOps Team in the Nashville, TN area. This person will work alongside our Systems Architect, Application Development Architect, and Security Engineer and focus on operationalizing our cloud-native infrastructure, enhancing CI/CD pipelines, ensuring system reliability and resilience, and providing 24x7 operational support.

What you will be doing:

Pipeline & Automation

  • Designing and implementing advanced CI/CD pipeline features using GitLab

  • Developing and maintaining Terraform modules for infrastructure provisioning

  • Creating and optimizing Ansible playbooks for configuration management and deployment automation

  • Integrating security scanning and compliance checks into deployment pipelines

Container & Kubernetes Operations

  • Building, configuring, and maintaining Azure Kubernetes Service (AKS) clusters

  • Developing and optimizing Helm charts for application deployments

  • Implementing and managing GitOps workflows

  • Monitoring and troubleshooting containerized applications and cluster performance

Infrastructure & Reliability

  • Implementing Infrastructure as Code best practices using Terraform and Ansible

  • Designing and executing disaster recovery procedures and business continuity plans

  • Performing system patching, upgrades, and maintenance activities

  • Establishing and maintaining comprehensive monitoring, alerting, and observability solutions using Prometheus and Grafana

Cost Optimization & Resource Management

  • Monitoring and analyzing Azure cloud spending patterns and resource utilization

  • Implementing cost optimization strategies including right-sizing, reserved instances, and auto-scaling policies

  • Developing dashboards and reports for cost tracking and forecasting

  • Collaborating with teams to optimize resource allocation and eliminate waste

Monitoring & Observability

  • Designing and implementing comprehensive monitoring solutions using Prometheus for metrics collection

  • Building and maintaining Grafana dashboards for infrastructure, application, and business metrics

  • Configuring intelligent alerting rules and escalation procedures

  • Establishing SLIs, SLOs, and error budgets for critical services

24x7 Support & Incident Response

  • Participating in on-call rotation for 24x7 production support

  • Leading Tier 3 incident response efforts for production outages and system issues

  • Performing root cause analysis and implementing preventive measures

  • Collaborating with development teams on performance optimization and troubleshooting

  • Maintaining runbooks and documentation for operational procedures

Requirements

Knowledge, Skills, and Abilities:

Technical Expertise (5+ years)

  • Strong experience with Kubernetes (AKS preferred) and container orchestration

  • Proficiency in Infrastructure as Code: Terraform and Ansible

  • Advanced GitLab CI/CD pipeline development and optimization

  • Experience with GitOps methodologies and leading toolsets like Helm, Flux and/or ArgoCD

  • Python scripting for automation and pipeline tasks

  • Azure cloud services and networking concepts

Monitoring & Cost Management

  • Hands-on experience with Prometheus for metrics collection and alerting

  • Proficiency in Grafana for dashboard creation and data visualization

  • Experience with Azure Cost Management tools and FinOps practices

  • Knowledge of resource optimization techniques and auto-scaling strategies

  • Understanding of cloud pricing models and cost allocation methods

DevOps & SRE Practices

  • Incident management and post-mortem processes

  • 24x7 on-call experience with escalation procedures

  • Disaster recovery planning and implementation

  • Security best practices in CI/CD and infrastructure

  • Experience with chaos engineering and resilience testing

Collaborative Skills

  • Experience working with cross-functional teams

  • Strong troubleshooting and problem-solving abilities under pressure

  • Documentation and knowledge sharing practices

  • Azure certifications (AZ-104, AZ-400, or AKS-related)

  • Experience with message bus systems (Azure Service Bus)

  • Knowledge of .NET applications and Angular frontend deployments

  • Familiarity with secret management solutions (Delinea or similar)

  • Experience with additional monitoring tools (Azure Monitor, Application Insights)

  • FinOps certification or cost optimization experience

  • Experience with alerting tools and PagerDuty integration
