Lead Senior Backend Engineer

T-Systems
Municipality of Madrid, Spain
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Shift work
Languages
English
Experience level
Senior

Job location

Municipality of Madrid, Spain

Tech stack

Java
API
Artificial Intelligence
Amazon Web Services (AWS)
Application Integration Architecture
Azure
C++
Cloud Computing
Software Quality
Code Review
Continuous Integration
Database Design
DevOps
Distributed Systems
Django
Fault Tolerance
Interoperability
Python
Key Management
Load Testing
OAuth
Queueing Systems
Role-Based Access Control
Prometheus
JSON Web Token
Software Engineering
SQL Databases
Web Application Frameworks
WebSocket
Data Logging
Data Processing
Real Time Systems
Flask
Delivery Pipeline
Large Language Models
Grafana
Concurrency
Generative AI
Backend
FastAPI
Event Driven Architecture
Build Management
Containerization
Kubernetes
Information Technology
Low Latency
Build Tools
Machine Learning Operations
Front End Software Development
TensorRT
API Design
API Gateway
REST
gRPC
Webhooks
Data Pipelines
Dynatrace
API Management
Docker
Microservices

Job description

Role Overview
We are looking for a senior backend engineer who is highly skilled in Python development, with strong system design skills and extensive experience building production infrastructure. You will own and architect the backend platform behind our AI inference endpoints: a multi-tenant system handling authentication, API key management, usage metering, billing, and request proxy services. You'll shape architecture decisions, raise engineering quality, and build systems that are performant, secure, and observable at scale.

Responsibilities
Design and build core platform services: API gateway, authentication, authorization, key rotation, and multi-tenant isolation.
Implement and optimize APIs and backend systems using Python frameworks such as FastAPI, Flask, or Django.
Architect usage metering, billing integration, and rate limiting for inference endpoints.
Build scalable, fault-tolerant microservices for data processing and AI integration.
Build and operate a high-throughput proxy/routing layer for AI model serving traffic.
Collaborate with cross-functional teams to design system architecture and ensure seamless system interoperability.
Design telemetry and observability from the ground up: structured logging, distributed tracing, metrics, and alerting.
Implement robust CI/CD pipelines, monitoring, and observability for high-performance production systems.
Drive technical decisions on architecture, data modeling, and technology choices.
Identify performance bottlenecks and drive improvements in reliability, scalability, and latency.
Establish engineering standards for the backend codebase: testing, code review, CI/CD, and deployment practices.
Ensure best practices for security, compliance, and maintainability across the API lifecycle.
Collaborate closely with the ML infrastructure team to integrate with model serving systems such as NVIDIA Dynamo, vLLM, or TensorRT-LLM.

Requirements

Qualifications
Degree in Computer Science, Software Engineering, or equivalent professional experience.
5+ years of experience building and operating backend systems in production, focusing on API design and high-scale environments.
Strong proficiency in Python and at least one systems-oriented language (Go, Rust, Java, or C++).
Experience designing and maintaining REST APIs and related protocols (gRPC, WebSockets, SSE, Webhooks).
Solid understanding of distributed systems: consistency, fault tolerance, concurrency, performance, networking, data pipelines, and optimization.
Experience designing authentication/authorization systems (OAuth2, JWT, API key management, RBAC/ABAC).
Experience with cloud infrastructure (AWS, GCP, or Azure) and containerized environments (Docker) with orchestration (Kubernetes).
Understanding of database design (SQL, ORMs) and data modeling at scale.
Fluent in English (written and verbal) and comfortable collaborating in international teams.

Nice-to-have Requirements
Experience with high-throughput, low-latency, or real-time systems, such as message queues or event-driven architectures.
Familiarity with generative AI or large language models, including integration of AI APIs, RAG workflows, or vector databases.
Experience with modern DevOps practices, infrastructure as code, GitOps workflows, and observability tools (Prometheus, Grafana, OpenTelemetry).
Exposure to ML serving infrastructure (model servers, GPU scheduling, inference optimization) is a plus but not required.
Occasional frontend development when necessary.
Experience in stress/load testing and performance evaluation of production systems.
Strong problem-solving mindset and commitment to code quality and performance excellence.
Tra

About the company

Company Description
T-Systems is part of the Deutsche Telekom Group, with around 30,000 employees worldwide. We create technology with purpose to generate a positive impact on society, and we are looking for curious talent eager to learn, take on challenges, and contribute ideas that transform our customers' experience. We trust people: we offer autonomy, continuous support, and a collaborative environment where you can grow without limits. We are one global team guided by respect, integrity, and a passion for doing better every day.

Job Description
T-Systems delivers advanced technology solutions across industries such as automotive, healthcare, and public services. Our AI Foundation Services team builds the platform infrastructure that powers AI inference at scale, including API gateways, authentication, billing, and multi-tenant serving. We design and build high-performance backend systems and APIs that power intelligent applications. The culture encourages deep technical expertise, creativity, and ownership.
