Secret Elastic SRE / Observability Engineer
Job description
-
Operate and support large-scale Elastic Stack environments across search, logging, and observability use cases.
-
Ensure platform reliability, uptime, scalability, and performance across mission-critical production systems.
-
Manage and maintain Kubernetes-based Elastic deployments (ECK preferred).
-
Develop automation for deployment, monitoring, and incident response workflows.
-
Integrate Elastic Stack with SIEM and security tools such as Splunk, EDR platforms, and telemetry systems.
-
Troubleshoot complex issues across distributed systems and resolve production incidents.
-
Implement and enhance observability frameworks including logs, metrics, and tracing.
-
Support CI/CD pipelines and infrastructure-as-code practices.
-
Maintain runbooks, documentation, and escalation procedures for operational continuity.
-
Participate in on-call rotations and incident response activities.
-
Update and manage data collectors across clusters and coordinate cluster maintenance activities.
-
Create visualizations, dashboards, and monitoring tools to track performance and network behavior.
Requirements
Piper Companies is seeking an Elastic SRE / Observability Engineer to support a federal technology and defense environment focused on mission-critical platform reliability and security. The role is ideal for an experienced engineer with strong expertise in Elastic Stack, Kubernetes, and site reliability engineering who thrives in highly secure, production-grade environments.
-
Strong experience with Elastic Stack in enterprise or production environments.
-
Hands-on expertise with Kubernetes; ECK operator experience highly preferred.
-
Strong Linux/Unix administration and networking fundamentals.
-
Proven experience in SRE or DevOps roles supporting production systems.
-
Experience building and supporting observability and monitoring frameworks.
-
Ability to troubleshoot distributed system issues within secure environments.
-
Experience with infrastructure automation tools such as Terraform or Ansible preferred.
-
Familiarity with SIEM and EDR tools such as Splunk, CrowdStrike, or Trellix is a plus.
-
Experience working in federal, defense, or regulated environments preferred.
-
Strong incident response mindset with the ability to perform under pressure.
-
Excellent communication skills and ability to collaborate across cross-functional technical teams.
-
Ability to work onsite on a hybrid schedule at Hanscom AFB, MA or Langley AFB, VA.
-
Eligibility to obtain or maintain a Secret security clearance required.
Benefits & conditions
-
Salary range: $180,000 - $200,000
-
Comprehensive benefits package including medical, dental, vision, and 401(k)
-
Hybrid work schedule (minimum 3 days onsite per week)
-
Opportunity to support a high-impact federal program within a secure environment