Observability Systems Engineer
Job description
As an Observability Systems Engineer, your work at GDIT will directly support the mission of USCENTCOM. You will play a crucial role in ensuring the performance, reliability, and visibility of mission-critical applications, networks, and infrastructure. You will design, implement, and maintain observability solutions that deliver real-time insights across complex distributed systems, enabling rapid issue detection, improved operational readiness, and enhanced mission success.
Application Performance Monitoring (APM)
- Design and implement APM solutions to monitor and optimize application performance across CENTCOM systems.
- Analyze application behavior, identify bottlenecks, and provide actionable recommendations to improve performance and reliability.
- Develop and maintain dashboards, alerts, and reports to track key performance indicators (KPIs).
End User Experience Monitoring
- Deploy tools and methodologies to measure end-user interactions with applications and services.
- Analyze user experience metrics, including response times, error rates, and service availability.
- Collaborate with Development, Network, Cyber, and System Operations teams to enhance user experience and resolve mission-impacting issues.
Network Performance Monitoring
- Implement and manage network performance monitoring platforms to ensure optimal network health.
- Monitor traffic, latency, and throughput to identify and resolve performance issues.
- Provide insights into network behavior and recommend improvements to enhance reliability and scalability.
Wire Traffic Monitoring
- Deploy and maintain wire-level monitoring solutions to capture and analyze network packets.
- Identify anomalies, troubleshoot issues, and ensure secure, efficient data transmission.
- Leverage packet-level data to support incident response and root cause analysis.
Observability Tools & Integration
- Configure, maintain, and optimize observability platforms including Dynatrace, AppDynamics, Riverbed Alluvio Suite, Splunk ITSI, and SolarWinds.
- Support stakeholders in defining observability requirements and integrating monitoring tools into existing workflows.
- Develop custom scripts, plugins, and integrations to extend monitoring capabilities.
Tools & Technologies
- Dynatrace, AppDynamics, and Riverbed Alluvio Suite for full-stack application and network performance monitoring.
- Splunk Enterprise / Splunk ITSI for log analytics, event correlation, and service health monitoring.
- SolarWinds and NetScout for network performance, device monitoring, and packet-level visibility.
- Prometheus and Grafana for metrics collection and visualization in containerized or DevSecOps environments.
- Zeek, Suricata, and Wireshark for wire-data analysis, packet inspection, and network anomaly detection.
Proactive Monitoring & Incident Response
- Establish proactive monitoring practices to detect and address issues before they impact mission operations.
- Work with cross-functional teams to investigate and resolve incidents with minimal downtime.
- Deliver detailed post-incident analysis and recommendations for future prevention.
Documentation & Knowledge Sharing
- Create and maintain documentation for observability tools, processes, and best practices.
- Train and mentor team members on observability methodologies and toolsets.
Requirements
Bring your technology expertise and drive for innovation to GDIT. The Observability Engineer / Systems Engineer Senior must have:
- Certification: Security+ CE or higher (DoW 8140 compliant)
- Experience: 5+ years of related work experience
Required Technical Skills:
- Strong back-end engineering capabilities, including building, testing, and validating systems in controlled environments prior to deployment on production networks.
- Strong experience with observability platforms such as Dynatrace, AppDynamics, Riverbed Alluvio Suite, Splunk ITSI, and SolarWinds.
- Proficiency in APM, end-user experience monitoring, network performance monitoring, and wire traffic analysis.
- Hands-on experience with network protocols, packet analysis, and traffic monitoring tools.
- Familiarity with scripting and automation (Python, PowerShell, Bash).
- Strong analytical and troubleshooting skills across application, network, and infrastructure layers.
- Excellent communication skills with the ability to convey technical insights to diverse audiences.
- Demonstrated ability to collaborate with developers, network engineers, cybersecurity teams, and operations personnel.
- Security clearance level: TS/SCI clearance required
- US citizenship required due to clearance requirement
Desired Skills:
- Advanced certifications such as Microsoft Certified: Azure Administrator or CompTIA Server+.
- Certifications in observability tools (Dynatrace Associate/Professional, AppDynamics Implementation Professional, Splunk Core User/Power User).
- Experience designing and implementing observability strategies for enterprise distributed systems.
Years of Experience
5+ years of related experience
- may vary based on technical training, certification(s), or degree
Certification
CompTIA Security+ CE
Travel Required
Less than 10%
Citizenship
US citizenship required
Benefits & conditions
At GDIT, the mission is our purpose, and our people are at the center of everything we do.
- Growth: AI-powered career tool that identifies career steps and learning opportunities
- Support: An internal mobility team focused on helping you achieve your career goals
- Rewards: Comprehensive benefits and wellness packages, 401K with company match, competitive pay and paid time off
- Community: Award-winning culture of innovation and a military-friendly workplace
The likely salary range for this position is $110,500 - $149,500. This is not, however, a guarantee of compensation or salary. Rather, salary will be set based on experience, geographic location, and possibly contractual requirements, and could fall outside of this range.