SAP NS2 DevOps Engineering Manager (Herndon, VA preferred)
Job description
This position requires access to customer data; candidates must be U.S. citizens. SAP NS2 does not offer visa sponsorship for this role.
Internal candidates must have their manager's approval to transfer.
Our Infrastructure Build and Operations organization is responsible for building and operating secure, reliable environments for our customers. We are an automation-first organization: our teams use Terraform and Ansible to provision, configure, and manage customer environments across multiple clouds.
We are looking for a DevOps Engineering Manager to lead the team that builds and evolves the "control plane" for Infrastructure Build and Operations, owning how we orchestrate, standardize, and automate the execution of our build process at scale. This role focuses on automating how we build and operate customer environments, as well as day-2 operations (changes, patching, scaling, and lifecycle management). The ideal candidate combines deep infrastructure-as-code expertise with strong people leadership and a product mindset toward internal platforms and tooling.
Responsibilities
- Lead and develop a high-performing engineering team within Infrastructure Build and Operations focused on building and operating automation (platforms, pipelines, and tooling that execute Terraform and Ansible at scale).
- Own the end-to-end strategy for how Infrastructure Build and Operations automates both building and operating customer environments, including workflow orchestration, pipeline design, and integration with ticketing, approvals, observability, and run operations.
- Design and oversee implementation of standardized patterns, modules, and frameworks for automated execution of Terraform and Ansible, enabling other build/operations teams to deliver consistent, secure environment builds and routine operations.
- Collaborate closely with environment build teams, operations teams, security, and product stakeholders to understand requirements and translate them into control-plane capabilities, reusable automation, and self-service workflows for provisioning and ongoing operations.
- Build and operate robust automation pipelines that orchestrate Terraform and Ansible runs for both initial provisioning and operational changes, including plan/apply workflows, approvals, rollbacks, drift detection, and change reporting.
- Promote and guide adoption of generative AI and other automation tooling to streamline environment builds, incident response, troubleshooting, and routine operational tasks, improving quality and reducing manual effort.
- Oversee implementation and continuous improvement of monitoring, logging, and audit capabilities for Terraform and Ansible executions, providing full traceability of who changed what, where, and when across customer environments.
- Establish and enforce standards for documentation, runbooks, and onboarding materials so that other Infrastructure Build and Operations teams can effectively and safely consume the control-plane services for both build and operations use cases.
- Partner with leadership to communicate status, risks, and outcomes for major automation and operations-improvement initiatives, ensuring alignment across Infrastructure Build and Operations and adjacent teams.
Requirements
- Demonstrated leadership and stakeholder management skills; able to collaborate, influence, and drive adoption of common platforms and standards across multiple teams.
- Experience with Terraform and Ansible.
- Proficiency in Python and shell scripting (e.g., Bash) for building automation tooling, pipeline integrations, and utility services around Terraform and Ansible.
- Background in Linux OS deployment, configuration, troubleshooting, and performance tuning in production environments.
- Solid understanding of SDLC and modern DevOps practices, including Git workflows, CI/CD, trunk-based development, automated testing for infrastructure-as-code, and release management of operational changes.
- Strong knowledge of foundational services in at least one major cloud provider (AWS, Azure, or GCP) and the ability to reason about multi-cloud patterns and controls for environment automation and operations.
- Strong investigation and debugging skills for failures, Terraform/Ansible execution issues, and cross-environment configuration problems impacting both provisioning and day-2 operations.
- Experience with centralized logging, metrics, and observability for automation platforms, IaC executions, and operational events in customer environments.
- Working knowledge of networking (IP routing, subnetting, DNS, load balancing) sufficient to guide teams building network-related IaC and troubleshoot environment issues.
- Experience with ticket management (e.g., ServiceNow), including integrating automation workflows with change, incident, problem, and request processes.
- Bachelor's degree in Computer Science or equivalent practical experience.
- 10+ years of experience in cloud infrastructure, automation, or DevOps roles, with significant hands-on work in Terraform and Ansible.
- 3+ years of experience leading engineering teams (people management or formal tech lead) in DevOps, SRE, platform, or infrastructure-focused organizations.
- 5+ years of Terraform experience, including designing and operating production-grade IaC for complex, multi-environment or multi-tenant use cases.
- 5+ years of Ansible experience, including building reusable roles/collections and integrating Ansible with CI/CD pipelines and operational workflows.
- Demonstrated experience designing and delivering automation platforms, pipelines, or frameworks consumed by multiple teams for both provisioning and ongoing operations.
- Strong understanding of core networking: TCP/IP, DNS, routing, and cloud connectivity.
- Candidates based within commuting distance of the Herndon, VA office are strongly preferred. In-office expectations for local candidates are 2-3 times a week.
- Must be able to travel quarterly for team meetings.
Preferred qualifications
- Hands-on experience architecting or operating internal control planes / platform services that orchestrate Terraform, Ansible, or similar IaC tools for both build and operations.
- Hands-on experience with AI/ML tooling or leveraging generative AI to improve engineering workflows, operational runbooks, and automation quality.
- Experience building and operating large-scale distributed systems or shared platform services in cloud or hybrid environments.
- Strong written and verbal communication skills with a track record of producing clear technical documentation, design proposals, and executive-ready status updates.