Senior DevOps / Infrastructure & AI LLM Systems Engineer (Hybrid)
Job description
This is a foundational role. You will be our first dedicated DevOps/Infrastructure Engineer and will take full ownership of everything related to cloud infrastructure, deployments, reliability, and scaling.
Our engineering team is made up of 7 people, operates with intensity, and moves fast. Rapid iteration is one of our biggest advantages. Over the past two years, we've built a tremendous amount, but the surface area ahead of us is even larger as we scale usage, models, and automation. You will play a key role in keeping our platform fast, reliable, and ahead of the curve.
This role goes beyond DevOps. You will also contribute at the LLM layer: running evaluations, experimenting with models, improving latency, optimizing costs, and helping shape how our AI systems operate at scale. If you enjoy working at the intersection of infrastructure, backend systems, and AI, this is exactly the kind of role where you'll thrive.
What You Will Own:
Infrastructure & Platform:
- All cloud infrastructure across AWS, GCP, and Azure.
- Kubernetes cluster management, scaling, upgrades, and security.
- CI/CD pipelines (GitHub Actions) and deployment systems.
- Observability, monitoring, logging, alerting, and reliability practices.
- Incident response, on-call rotation, and uptime improvements.
- Cost optimization and infra-level performance tuning.
- Security best practices, IAM, secrets, policies, and overall infra hygiene.
Backend & Data Systems:
- High-scale PostgreSQL (large DB, indexes, performance tuning).
- Redis and Sidekiq pipelines, queue scaling, job parallelization.
- API performance and throughput.
AI / LLM Systems:
- Manage and optimize LLM deployments across cloud providers.
- Improve latency, reliability, and cost through routing and system architecture.
- Help build and maintain eval pipelines and A/B tests.
- Contribute directly at the app level (prompts, agents, routing).
- Support or prototype self-hosted model experiments (optional but valuable).
Requirements
You have 8+ years of experience in DevOps / infrastructure roles, ideally in fast-paced SaaS or startup environments. You've scaled production systems before and know how systems behave under real load.
You're equally comfortable deep in Kubernetes or writing Ruby/Python for a quick script, tool, or LLM eval. You care about reliability, speed, and pragmatism. You enjoy working on AI systems and have hands-on experience with LLM-powered applications.
Your toolkit includes:
- Kubernetes, Docker
- AWS, Azure, GCP (strong in at least 2)
- GitHub Actions CI/CD
- PostgreSQL, Redis, Sidekiq
- LLM APIs (OpenAI, Azure, Anthropic; self-hosted a plus)
- Terraform or similar IaC
- Strong coding ability to contribute across the stack.

If you're earlier in your career but have strong infrastructure experience and clear upside, and you can reasonably grow into the full scope within 2 to 3 years, feel free to reach out. Raw talent is welcome, but depth of experience scaling systems is a big plus here.
Benefits & conditions
- High impact with ownership from day one: join a small, international engineering team where every feature you ship and every solution you design is directly visible in production.
- Competitive compensation based on experience, plus stock options.
- Fast growth = fast learning curve: in this hybrid engineering role, you'll quickly gain exposure to AI, product iteration, customer workflows, and cross-functional problem-solving.
- Work closely with founders and product/engineering leadership: your ideas and your ownership will directly influence the roadmap.
- A culture of ownership, transparency, and continuous improvement: we move fast, iterate constantly, and empower people to grow.
- Flexibility: fully remote in Europe, with a preference for our Barcelona office (the Boston office is also an option).