Senior Platform Operations Manager

Everforth Ecs
Fairfax, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Tech stack

Audit Trail
Disaster Recovery
Monitoring of Systems
Systems Architecture
Data Logging
Grafana
Performance Monitor
CloudWatch
Splunk

Job description

The Senior Platform Operations Manager provides senior-level operational leadership to ensure the reliability, continuity, and mission readiness of WDP platform services across classified and unclassified environments. This role is central to sustaining uninterrupted platform availability, driving disciplined continuity planning, and maintaining rigorous operational governance in direct support of Department of War (DoW) mission objectives.

  • Directs enterprise platform operations by managing operational reliability, service performance, and mission continuity activities across Department of War environments.
  • Monitors, tracks, and reports platform Service Level Objectives and key performance parameters using operational monitoring and logging tools such as Amazon CloudWatch, AWS CloudTrail, Grafana, and Splunk.
  • Oversees backup and restoration operations by developing detailed procedures, validating completion records, coordinating storage policies, and documenting restoration metrics for all supported environments.
  • Leads continuity of operations by developing, maintaining, and executing comprehensive continuity plans that identify critical mission functions, define recovery objectives, prioritize resources, and document end-to-end continuity procedures across platform services.
  • Designs, implements, and conducts disaster recovery and continuity of operations exercises, including tabletop simulations that assess restoration readiness, examine failure modes, validate procedural accuracy, and capture mission-impact trends.
  • Produces after-action reports documenting findings, corrective actions, and improvement opportunities aligned with Department of War mission readiness requirements.
  • Coordinates continuity communications with program leadership and Government stakeholders by preparing operational briefings, maintenance notifications, and restoration updates.
  • Maintains system architecture diagrams, continuity plans, operational checklists, and process documentation supporting audit readiness, traceability, and configuration accuracy.
  • Leads operational process improvement initiatives by analyzing performance trends, identifying continuity risks, and guiding corrective actions that enhance platform resilience, operational predictability, and continuity of mission operations.
  • Delivers disciplined operational leadership supporting uninterrupted platform availability and sustained mission assurance.
  • Performs other duties as assigned.

Requirements

  • Current Secret security clearance with the ability to obtain and maintain a Top Secret (TS) security clearance with Sensitive Compartmented Information (SCI).
  • 10 or more years of progressive experience in enterprise IT platform operations, with demonstrated responsibility for service reliability, continuity of operations planning, and disaster recovery program management in classified or federal government environments.
  • Hands-on experience designing and executing Continuity of Operations (COOP) and Disaster Recovery (DR) programs, including tabletop exercises, after-action reporting, and recovery objective validation across multi-enclave or multi-classification cloud environments.
  • Demonstrated proficiency with enterprise monitoring and logging platforms such as Amazon CloudWatch, AWS CloudTrail, Grafana, and Splunk, including configuration of dashboards, alerting thresholds, and Service Level Objective tracking in support of operational performance reporting.
  • Experience maintaining operational documentation in support of audit readiness and configuration traceability, including system architecture diagrams, continuity plans, backup and restoration records, and operational checklists in a DoW or federal government context.
  • Strong problem-solving and decision-making capabilities, with a proven ability to weigh the relative costs and benefits of potential actions and identify the most appropriate solution.
  • Highly developed interpersonal and oral/written communication skills, with the ability to effectively and professionally interact with a diverse set of stakeholders (from peers to end-users to executive management).
