Sr. Software Engineer, Product SRE
Role details
Job location
Tech stack
Job description
As a Senior Engineer (typically equivalent to Staff or Sr. Staff titles at other companies) on our Embedded Reliability team, you'll work directly within CrowdStrike product groups alongside product engineers and their leadership. You'll partner with engineering leaders to shape reliability roadmaps while doing hands-on work on complex distributed systems problems at scale. This is systems engineering work focused on writing code, building foundational infrastructure, and solving complex problems rather than day-to-day operations or ticket management. While we embrace the SRE moniker, you'll find that it means something much more service-oriented at CrowdStrike, and it affords you no shortage of Golang development initiatives, as well as the freedom to move up, down, and laterally across the stack as and when needed. It is far and away our most self-driven and autonomous backend development role.
CrowdStrike Falcon processes trillions of events per day. You'll work on the critical production systems that power this platform by improving, rearchitecting, and scaling them to meet growing demands. You'll write production code, debug complex distributed systems issues, and tackle problems spanning scale and resiliency, performance engineering, foundational observability and instrumentation, cost optimization, and failure modeling.
Product engineers and engineering leaders will come to you for guidance on architectural decisions because you've earned credibility through hands-on work and delivering results. You'll ensure follow-through on incident retrospectives with concrete improvements that eliminate entire classes of failures. You'll identify opportunities to extract common patterns into shared libraries or tools, or partner with platform teams on improvements that benefit multiple product groups. Recent examples from the team include resolving critical issues in leader election libraries and building infrastructure-as-code tools that eliminate manual deployment processes.
Why This Role Matters: CrowdStrike Falcon is the industry standard in cloud-native cybersecurity and threat hunting. Our customers depend on us to protect their businesses from sophisticated threats, and reliability isn't optional - it's fundamental to our mission. As an Embedded SRE, your work directly impacts whether organizations around the world can defend themselves against cyberattacks. You'll be working on problems that matter, at a scale that few companies can match, with the autonomy to make real architectural decisions.
What You'll Do:
- Partner with engineering leadership to define and drive reliability roadmaps
- Design and implement architectural improvements to services, libraries, and platforms that impact teams across CrowdStrike
- Establish foundational observability practices: ensure teams instrument services properly, react to signals effectively, and leverage observability to drive automation like continuous delivery
- Lead performance and cost optimization: profiling, bottleneck analysis, capacity planning, and efficiency improvements across cloud infrastructure
- Define and implement service-level objectives that drive decision-making and prioritization
- Conduct resilience engineering: chaos experiments, failure injection, and designing for graceful degradation
- Provide technical leadership during complex incidents and drive systemic improvements
- Mentor and coach engineers, building a culture of excellence and driving architectural standards across the organization
Requirements
- 7-10+ years building and operating distributed systems at scale
- Expert-level proficiency in at least one programming language; willingness to become proficient in Go
- Deep understanding of distributed systems: e.g. consensus algorithms, replication, consistency, failure modes, scalability patterns
- Proven experience scaling backend systems: e.g. sharding, partitioning, horizontal scaling, capacity planning, performance optimization
- Track record of making impactful architectural decisions and seeing them through to production
- Strong systems thinking and ability to influence without direct authority across organizational boundaries
- Degree in Computer Science or equivalent experience in data structures/algorithms/distributed systems
Bonus Points:
- Experience driving reliability improvements in organizations with hundreds or thousands of microservices
- Deep knowledge of Kubernetes, cloud platforms, or other large-scale orchestration systems
- Experience with AWS, Cassandra, Kafka, OpenSearch, or similar large-scale distributed systems
- Track record of building internal platforms or tools that other engineers use
- Experience in infrastructure cost optimization at scale
- Background in performance engineering: profiling, optimization, understanding system bottlenecks
- Experience with chaos engineering or resilience testing practices
- History of establishing SLO/SLI frameworks and error budgets in production environments
- Background in cybersecurity or intelligence fields
- Experience building developer platforms or improving developer experience
Benefits & conditions
CrowdStrike, Inc. is committed to fair and equitable compensation practices. Placement within the pay range is dependent on a variety of factors including, but not limited to, relevant work experience, skills, certifications, job level, supervisory status, and location. The base salary range for this position for all U.S. candidates is $140,000 - $215,000 per year, with eligibility for bonuses, equity grants, and a comprehensive benefits package that includes health insurance, 401(k), and paid time off.