Principal Site Reliability Engineer (DNS Security)
Job description
We are seeking development-heavy Site Reliability Engineers (SREs) who are passionate about bringing new ideas to all facets of DevOps and operational excellence within our Internet security infrastructure product portfolio.
This key role requires strong technical expertise to design, build, maintain, and scale production services and server farms, with an emphasis on applying development skills to automate and optimize complex infrastructures for large-scale global product deployments across multiple cloud platforms.
As an SRE, you will take end-to-end ownership of your focus areas and solve problems at every level of the stack. You will participate in all phases of the product lifecycle, from initial design and development through testing and deployment, while collaborating closely with cross-functional teams, including developers, product managers, and QA, to achieve common operational goals.
Your Impact
- Write Terraform configurations to deploy infrastructure and services to multiple cloud platforms.
- Build automation in Python or Go for provisioning and operating infrastructure at massive scale
- Work with Dev/QA teams to build pipelines and automation for delivering and deploying applications to production
- Build observability systems (logging, metrics, alerting) to ensure that services run reliably.
- Design and implement the infrastructure to ensure applications align with infrastructure requirements, focusing on scalability and reliability
- Collaborate with PMs to meet compliance requirements (SOC 2, FedRAMP, IL5) and establish a vision for continuous improvement.
- On-call support and incident resolution:
  - Participate in occasional on-call rotations to support the infrastructure
  - Investigate incidents, formulate hypotheses, and identify root causes to solve issues promptly
  - Write postmortem reviews and provide remediation recommendations
Requirements
- Bachelor's or higher degree in Computer Science, Engineering, or a related field, or equivalent military experience, required
- 6+ years of experience in DevOps, SRE, or related roles.
- Cloud experience: GCP, AWS, OCI, or Azure.
- Operational experience with containers (Docker, Kubernetes).
- Knowledge of TCP/IP, DNS, HTTP, and gRPC
- Proven experience in designing, implementing, and maintaining scalable and reliable infrastructure
- Strong proficiency in automation scripting and infrastructure as code (IaC)
- Excellent problem-solving skills and the ability to troubleshoot complex issues
- Effective communication skills, both written and verbal
- Experience working in collaborative, cross-functional environments
- Programming experience in Python, Go, or Rust
Additional Requirements
- Availability for on-call support
- Willingness to engage with customers directly and represent the technical aspects of the product
Benefits & conditions
The compensation offered for this position will depend on qualifications, experience, and work location. For candidates who receive an offer at the posted level, the starting base salary (for non-sales roles) or base salary + commission target (for sales/commissioned roles) is expected to be within the annual range listed below. The offered compensation may also include restricted stock units and a bonus. A description of our employee benefits may be found here (https://benefits.paloaltonetworks.com/).
$151,600.00 - $245,300.00/yr
Our Commitment
We're trailblazers that dream big, take risks, and challenge cybersecurity's status quo. It's simple: we can't accomplish our mission without diverse teams innovating, together.