Senior DevOps & Solutions Architect
Job description
As a Senior DevOps & Solutions Architect, you will be a key leader in designing, building, and scaling the core infrastructure that powers our agentic environment. This role involves architecting the future of our enterprise-grade AI solutions and providing strategic technical leadership to ensure our platform is robust, scalable, and secure.
InteractiveAI operates with two high-performance engines: product teams developing our Agentic IDE and implementation squads delivering domain-specific AI solutions. Depending on your expertise and ambition, you'll join the team where you can create significant value, with a transparent, performance-based path to growth and rewards.
- Architect and scale multi-tenant, cloud-agnostic runtimes (Kubernetes / GPU clusters) supporting on-premises, VPC, and hybrid setups.
- Design and implement secure end-to-end CI/CD pipelines for complex ML workflows, from data ingestion and fine-tuning (LoRA / QLoRA) to high-stakes deployment.
- Partner with product and client performance teams to accelerate custom agent deployment from sandbox (5 days) to production (4-6 weeks), meeting tight SLAs.
- Lead adoption of infrastructure-as-code practices using tools like Terraform, Ansible, or similar.
- Define and manage strategies for containerized workloads (Docker, Kubernetes) to optimize performance, cost, and reliability.
- Establish and enforce security compliance and data governance standards, especially for enterprise clients.
- Mentor junior engineers and provide strategic guidance on infrastructure design, incident response, and system reliability.
Requirements
A seasoned architect capable of leading the design and implementation of a robust, scalable infrastructure for our agentic platform and ecosystem. You should have a proven track record of architectural leadership, strong fundamentals, and operational maturity.
- At least 5 years of experience in DevOps, Site Reliability, or Infrastructure Engineering roles, with a minimum of 2 years in a solutions or systems architect capacity.
- Proven experience deploying and managing complex AI/ML workloads on major public clouds (AWS, GCP, Azure).
- Extensive experience designing, deploying, and managing resilient, distributed cloud solutions at scale.
- Deep expertise in containerization and orchestration (Docker, Kubernetes).
- Strong background in building and managing advanced CI/CD pipelines for software and ML lifecycles.
- Proficiency with infrastructure-as-code tools (Terraform, CloudFormation, Pulumi).
- Strong scripting and automation skills (Python, Bash, or similar).
- Experience with monitoring and logging stacks (Prometheus, Grafana, ELK).
- Excellent communication and collaboration skills, with the ability to lead and influence cross-functional teams.
Additional Requirements
- Experience with ML/AI infrastructure and MLOps tools (MLflow, Weights & Biases).
- Experience implementing security and compliance frameworks (GDPR, ISO 27001) in regulated environments.
- Experience in enterprise-grade or highly regulated industries is a plus.
- Proactive and resourceful: You identify gaps and drive solutions independently.
- Accountable and high-ownership: You treat our infrastructure as your own and honor commitments.
- Entrepreneurial mindset: You thrive in ambiguity, embrace change, and deliver in fast-paced environments.
- Architectural leader: You translate business needs into technical solutions and provide clear architectural vision.
- Team player: You collaborate effectively, give and receive feedback, and mentor others.
Benefits & conditions
- Competitive salary and performance bonuses.
- Potential for future equity for high performers.
- Health and wellness allowances.
- Private health insurance.
- Flexible work arrangements (preferably hybrid in Lisbon or Madrid), with travel when needed.
- 25 days of paid time off, excluding public holidays.