Lead Senior Backend Engineer
Job description
creativity, and ownership.

Role Overview
We are looking for a senior backend engineer highly skilled in Python development with strong system design skills and extensive experience building production infrastructure. You will own and architect the backend platform behind our AI inference endpoints: a multi-tenant system handling authentication, API key management, usage metering, billing, and request proxy services. You'll shape architecture decisions, raise engineering quality, and build systems that are performant, secure, and observable at scale.

Responsibilities
Design and build core platform services: API gateway, authentication, authorization, key rotation, and multi-tenant isolation.
Implement and optimize APIs and backend systems using Python frameworks such as FastAPI, Flask, or Django.
Architect usage metering, billing integration, and rate limiting for inference endpoints.
Build scalable, fault-tolerant microservices for data processing and AI integration.
Build and operate a high-throughput proxy/routing layer for AI model serving traffic.
Collaborate with cross-functional teams to design system architecture and ensure seamless system interoperability.
Design telemetry and observability from the ground up: structured logging, distributed tracing, metrics, and alerting.
Implement robust CI/CD pipelines, monitoring, and observability for high-performance production systems.
Drive technical decisions on architecture, data modeling, and technology choices.
Identify performance bottlenecks and drive improvements in reliability, scalability, and latency.
Establish engineering standards for the backend codebase: testing, code review, CI/CD, and deployment practices.
Ensure best practices for security, compliance, and maintainability across the API lifecycle.
Collaborate closely with the ML infrastructure team to integrate with model serving systems such as NVIDIA Dynamo, vLLM, or TensorRT-LLM.

Requirements
Qualifications
Degree in Computer Science, Software Engineering, or equivalent professional experience.
5+ years of experience building and operating backend systems in production, focusing on API design and high-scale environments.
Strong proficiency in Python and at least one systems-oriented language (Go, Rust, Java, or C++).
Experience designing and maintaining REST APIs and related protocols (gRPC, WebSockets, SSE, Webhooks).
Solid understanding of distributed systems: consistency, fault tolerance, concurrency, performance, networking, data pipelines, and optimization.
Experience designing authentication/authorization systems (OAuth2, JWT, API key management, RBAC/ABAC).
Experience with cloud infrastructure (AWS, GCP, or Azure) and containerized environments (Docker) with orchestration (Kubernetes).
Understanding of database design (SQL, ORMs) and data modeling at scale.
Fluent in English (written and verbal) and comfortable collaborating in international teams.

Nice-to-have Requirements
Experience with high-throughput, low-latency or real-time systems, such as message queues or event-driven architectures.
Familiarity with generative AI or large language models, including integration of AI APIs, RAG workflows, or vector databases.
Experience with modern DevOps practices, infrastructure as code, GitOps workflows, and observability tools (Prometheus, Grafana, OpenTelemetry).
Exposure to ML serving infrastructure (model servers, GPU scheduling, inference optimization) a plus, but not required.
Occasional frontend development when necessary.
Experience in stress/load testing and performance evaluation of production systems.
Strong problem-solving mindset and commitment to code quality and performance excellence.
Tra