Site Reliability Engineer
Job description
Capchase provides flexible payment solutions to B2B software, cloud, and AI companies. Our core product, Capchase Pay, offers a buy-now-pay-later option for B2B SaaS, hardware, and cloud purchases, helping companies sell more and accelerate cash collection.

Founded in 2020 and headquartered in NYC, Capchase has provided over $2.5B in funding to thousands of companies and operates in eight countries across North America and Europe. We have raised over $1B from top fintech investors and credit partners, including QED (Nubank, Klarna), 01 Advisors (Tipalti, MasterClass), Bling Capital (Airtable, GitLab, Lyft, Square), SciFi (Stripe, Brex), and Caffeinated (Opendoor, Airtable).

Our achievements include:
* Supporting thousands of software companies and buyers
* Having 80 Capchasers from 15+ nationalities
* Operating in 8 markets
* Achieving top-decile growth
* Ranking #2 in BNPL, #1 in B2B

In December 2024, we reached the top in G2's Installment Payment and BNPL categories, surpassing 2,000 vendors and buyers using Capchase Pay.

Why work with us?
Help accelerate an industry: At Capchase, we're creating a new category, making every day different. Join a diverse team of 15+ nationalities passionate about helping innovative companies thrive.
Be part of our growth! As a key member of our engineering team, you'll influence our culture, processes, and infrastructure as we scale 50x. This role offers the chance to work in a fast-paced environment, collaborate with talented engineers, and build a resilient, high-performance product. You will be responsible for ensuring the availability, latency, performance, efficiency, scalability, and reliability of our systems, while shaping the long-term vision for Site Reliability Engineering at Capchase.

What will you do?
* Infrastructure & Scalability: Design and evolve our architecture to scale 50x, lead scalability initiatives, and partner with leadership on critical systems strategy.
* Reliability & Performance: Own SLAs/SLOs/SLIs, conduct capacity planning, and standardize observability practices.
* Monitoring & Observability: Define monitoring requirements and implement tools for trend analysis, anomaly detection, and system health visualization.
* CI/CD & Developer Velocity: Maintain pipelines, automate processes, and enhance deployment and testing workflows.
* Incident Management & Disaster Recovery: Lead on-call rotations, manage incidents, and drive postmortem actions.
* Security & Compliance: Collaborate on security practices and ensure compliance to support scalability and trust.
* Team & Culture Building: Define the SRE roadmap, participate in hiring and mentoring, and foster a culture of ownership and continuous improvement.

Requirements
What are we looking for?
* Bachelor's in Computer Science or related field, or equivalent experience
* Proficiency in programming languages like C++, Elixir, JavaScript, Python, or Go
* Strong understanding of algorithms, data structures, and troubleshooting distributed systems
* Hands-on experience with Kubernetes, Terraform, and GCP
* Excellent debugging, automation, and operational skills
* Effective communicator with a systematic problem-solving approach

This role is ideal for engineers passionate about building scalable, high-performing systems and collaborating across teams to shape engineering reliability.

We are an equal opportunity employer valuing diversity and do not discriminate based on race, religion, gender, age, or disability.

Additional Details
* Seniority level: Mid-Senior level
* Employment type: Full-time
* Job function: Engineering and IT