Senior DevOps Engineer
Job description
We're looking for a Sr. DevOps Engineer to join our team! Reporting to the Director of TechOps, this role is responsible for the design, implementation, administration, and continuous improvement of secure, scalable, and highly available AWS-based infrastructure and supporting operational services. This role works closely with engineering, security, and operations teams to enhance deployment automation, strengthen system reliability, improve observability, and support efficient delivery of business-critical applications and services.
This position is instrumental in advancing infrastructure automation, Azure DevOps pipeline maturity, Datadog monitoring and alerting, cloud security, disaster recovery, and business continuity readiness. The Senior DevOps Engineer also helps ensure that infrastructure solutions are resilient, operationally effective, and aligned with business objectives, growth, and cost optimization goals.
What You'll Do:
- Design, implement, and continuously improve secure, scalable, and highly available AWS-based cloud infrastructure and supporting services.
- Build, manage, and optimize Infrastructure as Code (IaC) solutions for AWS environments, including networking, compute, storage, and related services.
- Develop, maintain, and enhance Azure DevOps pipelines to improve deployment speed, consistency, reliability, and quality across development, test, and production environments.
- Automate infrastructure provisioning, configuration management, system maintenance, and operational workflows to reduce manual effort and improve operational efficiency.
- Monitor infrastructure, application health, logging, alerting, and service availability using Datadog; proactively identify trends, risks, and issues to minimize outages and improve service reliability.
- Partner with engineering, security, and operations teams to support application deployments, infrastructure changes, production readiness, and incident response.
- Implement and maintain disaster recovery, backup, resiliency, and business continuity capabilities to support operational readiness and recovery objectives.
- Strengthen cloud security by supporting identity and access management, patching, vulnerability remediation, system hardening, and adherence to established policies and standards.
- Evaluate existing infrastructure and operational practices and recommend improvements in scalability, performance, reliability, security, and cost optimization.
- Monitor and optimize AWS resource utilization and cloud spend while maintaining performance, availability, and scalability requirements.
- Perform capacity planning and forecast infrastructure resource requirements to support current and future business needs.
- Provide technical leadership, mentorship, and guidance through collaboration, documentation, knowledge sharing, and operational best practices.
- Participate in troubleshooting, root cause analysis, and post-incident remediation efforts to improve overall system stability and resilience.
- Work with third-party vendors and service providers as needed to support infrastructure, tooling, and service delivery objectives.
- Contribute to shared team and organizational objectives through strong cross-functional partnership, communication, and execution.
Requirements
- Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field, or equivalent combination of education and practical experience.
- 7+ years of experience designing, implementing, and supporting production infrastructure in AWS cloud environments.
- Hands-on experience with core AWS services such as EC2, VPC, S3, CloudFront, Route 53, IAM, and Lambda.
- Strong experience building, maintaining, and improving deployment automation and CI/CD pipelines, preferably using Azure DevOps.
- Strong hands-on experience with Datadog for infrastructure and application monitoring, alerting, dashboards, logging, and proactive operational visibility.
- Proven ability to use monitoring and observability data to identify trends, troubleshoot issues, improve service reliability, and reduce incident response time.
- Experience supporting high-availability, high-traffic, and customer-facing web or ecommerce platforms.
- Strong experience administering Windows Server and Amazon Linux environments in production. Experience with Microsoft Active Directory and enterprise identity/access management.
- Experience supporting database infrastructure such as Microsoft SQL Server and MySQL, including availability, performance, and operational considerations.
- Strong understanding of core infrastructure disciplines, including networking, compute, storage, systems administration, and cloud architecture.
- Experience supporting web and application hosting platforms, including IIS and Apache.
- Strong understanding of security best practices, including identity and access management, patching, system hardening, backup, disaster recovery, and business continuity.
- Experience troubleshooting complex production issues and participating in incident response, root cause analysis, and service restoration.
- Ability to evaluate and optimize infrastructure for scalability, resiliency, operational efficiency, and cost management.
- Experience working cross-functionally with engineering, security, and operations teams in a fast-paced production environment.
- Experience with additional deployment or source control platforms such as Bitbucket or AWS CodeDeploy.
- Experience with scripting and automation using PowerShell, Python, or Bash.
- Experience with cloud cost optimization, usage analysis, and FinOps-related practices.
- Experience supporting regulated, business-critical, or revenue-generating production environments.
- Familiarity with modern release management, change control, and operational best practices.
Working Conditions:
- Continuous work on a provided laptop and equipment.
- Ability to work from home, including a stable internet connection.
- Occasional travel to the headquarters office, if necessary.
Benefits & conditions
- Health insurance
- 401(k) matching
- Paid time off
- Vision insurance
- Dental insurance
- Life insurance
- Work from home
The starting base pay for this position is $130,000 - $150,000 per year. The actual base pay offered may vary based on the candidate's job-related knowledge, skills, experience, and geographic location.
In addition to the base pay, this position may be eligible for a discretionary annual bonus as part of the compensation package.
Outlook Amusements offers a robust benefits package including company-subsidized Medical, Dental, and Vision plans, 401(k) with an employer match, Life Insurance, Paid Time Off, and more!