AIOps Engineer

CAREER LINK
Fort Belvoir, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$218K

Job location

Fort Belvoir, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Cloud Computing
Cloud Engineering
Computer Security
Data Security
Python
Machine Learning
Scripting (Bash/Python/Go/Ruby)
Data Ingestion
Grafana
Multi-Cloud
SolarWinds (Software)
Cybercrime
Cyber Warfare
Splunk
API Management
Unsupervised Learning
ServiceNow
VMware

Job description

  • Cross-Functional Leadership: Lead the AIOps platform initiative by acting as the primary technical liaison to existing Network Engineering, ServiceNow, and SolarWinds administration teams to establish unified telemetry pipelines.
  • ITSM Orchestration & Automation: Architect closed-loop remediation workflows by deeply integrating Splunk ITSI alerts with ServiceNow Event Management and Incident Management modules.
  • Mission-Critical Observability: Architect and maintain Splunk AIOps solutions across unclassified and classified enclaves to provide real-time situational awareness.
  • Infrastructure Telemetry Integration: Normalize and correlate network performance and fault data from SolarWinds with server and application logs to provide a holistic view of enterprise health.
  • Advanced ML Development: Deploy custom machine learning models via Splunk MLTK to identify anomalous behavior, potential cyber threats, and infrastructure degradations.
  • Secure Data Integration: Engineer secure data ingestion pipelines for telemetry data from cross-domain solutions and tactical edge devices.
  • Incident Reduction: Utilize IT Service Intelligence (ITSI) to correlate multi-source events, reducing noise and prioritizing high-impact mission alerts.
  • Cyber Defense Support: Collaborate with the Cyber Security Service Provider (CSSP) to integrate AIOps insights into defensive cyber operations (DCO).
  • Compliance & Documentation: Ensure all observability tools comply with DoW STIGs and IL5/IL6 protocols; develop and maintain architectural documentation and compliance traceability.
  • Mission Alignment: Stay current on AIOps and related capabilities relevant to DoD, federal, and intelligence mission systems.

Requirements

  • Security Clearance: Active Top Secret / Sensitive Compartmented Information (TS/SCI) required at time of hire.
  • Certification: Active IAT Level II certification (e.g., Security+ CE, CySA+, GSEC, or SSCP) required.
  • Citizenship: United States Citizenship is required.
  • Platform Experience: 7+ years of experience with Splunk Enterprise, including architectural design, cluster management, and advanced Search Processing Language (SPL).
  • AIOps & ITSM: 3+ years of experience implementing AIOps workflows, including integration with enterprise ITSM solutions (ServiceNow) for automated root cause analysis and remediation.
  • Machine Learning: Proven track record of building, testing, and tuning supervised and unsupervised models within the Splunk MLTK.
  • Scripting & Automation: Advanced scripting skills (e.g., Python) for developing custom search commands, building API integrations, and automating remediation tasks.
  • Leadership: Experience leading technical working groups and directing the efforts of adjacent infrastructure and development teams.
  • Operational Experience: Prior experience working within a DoW/DoD Operations Center (NOC/SOC) or supporting mission-critical systems and networks.
  • Communication: Must be able to present designs, plans, and analyses of alternatives to technical leadership boards for approval.

Desired Qualifications:

  • Enterprise Aggregation: Experience aggregating and correlating telemetry from diverse tools, specifically SolarWinds, ServiceNow, and VMware vCenter.
  • Expert Certification: Splunk Enterprise Certified Architect or Splunk ITSI Certified Admin.
  • Cloud Observability: Experience with Cloud Native Computing Foundation (CNCF) observability tools in secure hybrid multi-cloud environments (Azure/AWS).
  • RMF/ATO Knowledge: Understanding of the Risk Management Framework (RMF) and the Authorization to Operate (ATO) process for AI/ML workloads.

Benefits & conditions

  • Tuition reimbursement
  • 401(k)
  • Health insurance
  • 401(k) matching
  • Paid time off
  • Vision insurance
  • Dental insurance
  • Profit sharing
  • Training & development
  • Tuition assistance