Software Engineer - SRE
Job description
Verrus is looking for a software-focused Senior Site Reliability Engineer. This is a full-time position based in the Mountain View, CA office.
Verrus takes a technology-forward approach to designing data center infrastructure: we operate at the intersection of leading-edge electrical and mechanical engineering, control systems, and software engineering. As a result, software development at Verrus spans a broad space, including integration with physical infrastructure as well as third-party vendors and systems. The Software Infrastructure team touches every aspect of Verrus, from integration with corporate IT to the data center systems.
In this role, you will sit at the intersection of software engineering and critical infrastructure operations. Your primary objective will be to treat infrastructure operations as a software problem: building tooling, automation, platforms and integrations that allow Verrus to deploy and manage software infrastructure with high reliability and scalability.
You will work closely with several teams across the company: the software optimization team developing the Verrus optimization engine, as well as controls, mechanical and electrical engineering, and product development.
Responsibilities
- Infrastructure as Code (IaC): architect, deploy, and maintain robust infrastructure with a software engineering mindset, ensuring environments are testable, reproducible, and scalable.
- Tooling & Automation: write high-quality, maintainable code (primarily in Golang) to build internal developer tools, CLI tools, and automation scripts that reduce toil and streamline deployment pipelines.
- Cluster Management: manage and orchestrate containerized applications on both AWS EKS and HCI (hyper-converged infrastructure), ensuring optimal resource allocation for workloads.
- Reliability & Observability: design and implement comprehensive monitoring, logging, and alerting systems (primarily using tools such as Prometheus and Grafana) to ensure high availability and the visibility needed for rapid incident response.
- CI/CD Optimization: help optimize the CI/CD pipelines to enable seamless, safe, and frequent code deployments across environments.
- System Architecture: collaborate with the broader team to design and implement resilient distributed systems that avoid single points of failure (SPOFs).
- Mentorship: provide technical guidance and mentorship to mid-level engineers, championing SRE best practices and software engineering principles within the infrastructure domain.
Requirements
- 5+ years of experience in Software Engineering, DevOps, or Site Reliability Engineering.
- Strong proficiency in Golang is required. You should be comfortable writing production-grade software, not just scripting.
- Deep experience with public cloud providers (AWS, GCP) and container orchestration.
- Proven track record of managing infrastructure using code (Pulumi preferred).
- A solid foundation in computer science or software engineering principles, including data structures, algorithms, and system design.
- Experience with bare-metal provisioning or hybrid-cloud environments.
- Familiarity with energy data standards and protocols (e.g., Modbus TCP, DNP3, MQTT), IoT protocols, or industrial control systems.
- Familiarity with NATS or other publish/subscribe technologies.
- Familiarity with Rust and functional programming basics.
- An understanding of distributed systems and general RPC architectures.
- Ability to efficiently identify and resolve issues using problem-solving and communication skills.
- Adaptability to work in a rapidly changing, fast-paced environment and to pick up new technical areas of expertise.
Benefits & conditions
The annual base salary range for this position in California is $150,000-$180,000. Salary is only one part of Verrus' comprehensive compensation package, which also includes a discretionary bonus, equity, general health benefits and paid time off. Compensation is determined by multiple factors, including market location, and may vary based on job-related knowledge, skills, and experience.