Site Reliability Engineer
Job description
Do you love automation and enjoy solving technical problems with scalable and well-managed solutions in a constantly changing environment? Old Mission Capital is seeking a well-rounded technologist with core strengths in Linux and network administration. This Site Reliability Engineer will own and manage the deployment, maintenance, and enhancement of our servers, and will be responsible for the upkeep, configuration, and reliable operation of the computer systems that host proprietary trading, third-party, and open-source applications. The main goal of this role is to develop tools and systems that automate and optimize our operational workflow.
- Release management and performance engineering
- Automate processes using Shell and Python scripting
- Deploy, run and monitor all applications
- Coordinate, prioritize, and plan the necessary changes
- Collaborate effectively with our traders, developers, and other teams to meet the Firm's technology needs
- Maintain written documentation and provide timely updates on support tickets
- Participate in an on-call rotation (2 shifts a month)
Requirements
- An undergraduate or advanced degree in a quantitative field such as Computer Science, Engineering, or one of the hard sciences
- 4+ years of Python, Perl, Golang, C++, or Rust development experience
- Must be comfortable working in a Red Hat Enterprise Linux environment
- Must have configuration management experience with Ansible (preferred) or Salt/Chef/Puppet
- Strong experience with containerization and orchestration: Podman, Docker, Kubernetes, Rancher, and/or HashiCorp Nomad
- Familiarity with cloud infrastructure, distributed systems, or HPC clusters would be extremely useful
- Exposure to Redfish or other BMC APIs would be extremely useful
- Strong analytical and problem-solving skills
- Familiarity with automated CI/CD pipelines for large-scale systems management is a strong plus
- Familiarity with the full monitoring stack (tracing, metrics, and logging) in the Grafana/Prometheus ecosystems is a plus
- Great communication skills, both written and verbal
- Ability to prioritize and multi-task in a fast-paced environment
- Occasional after-hours and weekend support (including travel) may be required for system upgrades or emergencies
Benefits & conditions
- Fully paid Medical, Dental, Vision, Disability, and Life Insurance
- Fully stocked kitchen; free breakfast and lunch every day on-site
- Tuition Reimbursement Program
- 401(k) with employer match
- Paid vacation, sick, and parental leave
- Commuter and Flexible Spending Programs
- In office Monday-Friday with 10 remote days per year
Base Salary Range
$175,000 - $225,000. Salaries are based on numerous factors such as skills, experience, and education. Our compensation package also includes a discretionary bonus and a comprehensive benefits program for full-time employees. For more information, reach out to your recruiter.