Staff Site Reliability Engineer (SRE)
Xometry
North Bethesda, United States of America
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Intermediate
Compensation: $190K
Job location: North Bethesda, United States of America
Tech stack
JavaScript
Amazon Web Services (AWS)
Continuous Integration
GitHub
Monitoring of Systems
Python
Shell
Reliability Engineering
Software Engineering
Kubernetes
Sentry
Terraform
Docker
Job description
- Lead the resolution of the most complex, cross-cutting technical challenges and drive them to completion, ensuring alignment with organizational goals.
- Define the long-term technical roadmap and strategy for core SRE platforms and services.
- Architect, scope, and drive delivery of major engineering initiatives, balancing trade-offs between effort, risk, and long-term business impact.
- Set the standard for clean, efficient, and well-documented code, focusing on core system improvements and features.
- Collaborate effectively across teams, communicating clearly on progress, blockers, and outcomes to technical and executive stakeholders.
- Provide technical mentorship to engineers across the organization and drive organization-wide adoption of best practices for operational excellence, observability, and infrastructure security.
- Demonstrate accountability, curiosity, and continuous improvement in all aspects of your work.
- Develop, configure, and maintain underlying platforms for deployed software (AWS accounts and networking, Kubernetes clusters, and similar systems), focusing on architectural design and scaling.
- Develop, configure, and maintain observability and monitoring tools (Coralogix, Sentry, etc.), including defining SLOs and SLIs.
- Develop, configure, and maintain software development (CI/CD) tools (GitHub Actions runners, ArgoCD, etc.).
Requirements
- 8+ years of professional experience in infrastructure management or backend software development in a fast-paced, product-driven environment.
- Demonstrated deep technical expertise in one or more of the following languages: Python, JavaScript, or Unix Shell.
- Extensive experience with AWS, including architecting, deploying, monitoring, and scaling production workloads across multiple accounts and regions.
- Deep expertise with Terraform, Kubernetes, advanced CI/CD pipelines, and Docker/containerization best practices.
- Proven experience leading technical projects with significant cross-functional impact.
- Comfortable working in an operational environment, including acting as a point of escalation for complex issues outside of the on-call schedule.
- Excellent communication and collaboration skills, comfortable engaging with technical and executive stakeholders.
Benefits & conditions
The estimated base salary range for new hires into this role is $165,000 - $190,000 annually plus commission, depending on factors such as job-related skills, relevant experience, and location. We also offer a competitive benefits package, including 401(k) match; medical, dental, and vision insurance; life and disability insurance; generous paid time off, including vacation, sick leave, floating and fixed holidays, and maternity and bonding leave; EAP and other wellbeing resources; and much more.
About the company
Xometry (NASDAQ: XMTR) powers the industries of today and tomorrow by connecting the people with big ideas to the manufacturers who can bring them to life. Xometry's digital marketplace gives manufacturers the critical resources they need to grow their business while also making it easy for buyers at Fortune 1000 companies to tap into global manufacturing capacity.
Xometry is seeking a Staff Site Reliability Engineer to join our Site Reliability Engineering (SRE) Organization. As a senior individual contributor in this role, you will define the technical strategy and direction for the reliability and performance of our infrastructure and software systems. You will lead complex, cross-functional initiatives, mentor senior engineers, and influence key architectural and business decisions across our technology organization. You will use your deep technical expertise to architect and implement highly reliable and scalable infrastructure solutions that empower our technology organization to deliver features to our customers quickly and safely at scale.