Mainframe Automation
Job description
- Own the mainframe automation strategy and roadmap to improve reliability, detection, and recovery.
- Scale automated solutions across infrastructure domains (batch, storage, networking, middleware) using APIs, orchestration, and infrastructure-as-code.
- Architect and govern multi-site failover automation; maintain and test DR playbooks and runbooks.
- Define and operationalize SLOs/SLIs, error budgets, and alerting standards; reduce MTTA/MTTR through event correlation and automated remediation.
- Instill disciplined engineering: peer reviews, version control, change management gates, and automation standards aligned to risk/compliance.
- Build and lead a high-performing team; develop talent in REXX, z/OS automation, DevOps, and integration.
- Partner across platforms, applications, cyber, risk, and compliance to prioritize automation investments that reduce toil and operational risk.
- Own KPIs and continuous improvement cycles; communicate performance, risks, and outcomes to senior stakeholders.
Responsibilities:
Technical leadership and hands-on guidance in:
- Mainframe automation frameworks and system state management
- Expert-level REXX scripting; deep z/OS operating environment expertise
- Designing/supporting automated failover in multi-site environments
- RESTful APIs and systems integration for orchestration
- DevOps tooling (e.g., Jenkins including Zowe plug-in, Ansible) and CI/CD for mainframe workloads
- ServiceNow ITSM workflows and automation
- AIOps/event-management platforms (e.g., Moogsoft or equivalent)
- SQL, JCL, ISPF/TSO and related toolchains
Governance, risk, and controls:
- Strong change control discipline (Git, peer reviews, deployment safeguards)
- High-quality documentation: runbooks, process/architecture standards
- Alignment with audit and compliance requirements
Stakeholder and vendor management:
- Partner with infrastructure leaders and vendors; drive outcomes via SLAs and measurable KPIs
- Translate functional requirements into automation workflows; steward solution documentation and training
Operations ownership:
- Troubleshoot and drive root-cause fixes; prepare detailed problem reports
- Resilience/DR: hands-on experience with GDPS and Safeguarded Copy
- Collaborate on performance tuning, capacity planning, and cost efficiency
Leadership expectations:
- Lead engineers/SREs with clear objectives, coaching, and succession planning; foster documentation and knowledge sharing.
- Shape automation standards across the technology services group; align priorities to business outcomes and risk posture.
- Present strategy, results, and risks to senior technology leadership; influence cross-platform adoption.
- Practice financial stewardship for tooling, licensing, and infrastructure; identify savings via automation and improved change success rates.
Requirements
- Extensive experience leading mainframe automation in large, multi-site enterprises
- Expert REXX and deep z/OS knowledge with a track record of robust automation delivery
- Demonstrated success implementing resilience/DR automation (GDPS, Safeguarded Copy)
- Proficiency with DevOps tooling (Jenkins incl. Zowe plug-in, Ansible) and REST API integrations
- ServiceNow ITSM automation; familiarity with AIOps/event correlation platforms
- Strong SQL, JCL, ISPF/TSO; ability to mentor across these domains
- Proven change management discipline (Git, peer reviews) and excellence in documentation/runbooks
- Exceptional communication skills for complex topics, training, and stakeholder engagement
- Experience owning KPIs, quarterly improvement cycles, and executive-level reporting