Platform Engineer - AI Platform team
Job description
The Internal Platform is a pivotal foundation that accelerates product development by providing a reliable, scalable, and self-service ecosystem. It supports the entire software lifecycle and is meticulously tailored to meet the organization's technological needs and strategic direction.
The platform enables development teams to operate autonomously in 80% of cases, reducing their dependency on the Internal Platform area. It also ensures that compliance, security, and business continuity are integrated across the entire platform, protecting the overall reliability of services and the integrity of data.
Overall, a platform engineering team plays a critical role in ensuring a company's technology infrastructure is reliable and scalable.
The Internal Platform area consists of 34 people, including 27 individual contributors, 2 Staff Engineers, 4 Engineering Managers, and the Head of Internal Platform. The team is organized to address stakeholders' needs efficiently.
The AI Platform Engineers team is part of the Internal Platform area and currently consists of 3 Platform Engineers, a Software Engineer, and an Engineering Manager. The team is accountable for designing, building, and maintaining the company's AI infrastructure, ensuring high availability, scalability, security, and cost efficiency. As AI needs are growing, we are rapidly expanding the team to become "The Magnificent Seven".
What are the challenges in the team?
- The complexity of the Docplanner organization: Docplanner is a complex organization with multiple teams working on various products and services. One of the main challenges for the platform engineering teams is to understand and integrate the diverse technology stacks used by different teams.
- Scalability and reliability of systems: As Docplanner grows and expands, the demand on its technology infrastructure also increases. The platform engineering team must ensure that the systems are scalable, secure, and designed to handle high traffic.
Who will you work closely with?
You will work closely with Product Teams to understand their technical requirements and provide them with the necessary tools, services, and infrastructure to support their development efforts. Your insights and expertise will contribute to enhancing the efficiency and effectiveness of their workflows.
How would you be impacting our mission?
- Design, implement, and manage AI Platform architecture.
- Control AI-related costs, including models, GPUs, and other resources.
- Work closely with Product teams to provide technical expertise and propose innovative solutions.
- Guarantee highly available AI services through best practices and automation.
- Collaborate with ML teams to operationalize AI models and integrate them into systems.
- Troubleshoot critical issues and continuously optimize system performance.
- Provide day-to-day support to team members and contribute to knowledge sharing.
Requirements
- Strong experience with Kubernetes (must-have).
- Knowledge of Terraform, Crossplane (nice to have) and Helm charts.
- Experience with CI/CD tools like GitHub Actions, Argo CD and Argo Rollouts.
- Familiarity with tools like Karpenter, KEDA, Velero and Cilium.
- Experience building secure, scalable, and high-availability environments on AWS.
- Understanding of Disaster Recovery Planning and strategies for cloud infrastructure.
- Familiarity with cloud AI offerings such as AWS Bedrock or Azure OpenAI.
- Experience with Python apps at scale.
- Experience working with GPUs and distributing workloads.
- Be prepared to work in a startup-like environment, where priorities can shift quickly, tasks evolve rapidly, and flexibility is key to success in our fast-paced setting.
- Comfortable scripting or developing tools in Bash or Go.
- You can communicate in English (both spoken and written - min. B2 level).
- Growth mindset: nobody ticks all those boxes above, but willingness to learn is strongly valued here.
Benefits & conditions
- Technical Interview - 1 hour
- Home assignment
- Assignment presentation + Coffee with the team - 2 hours (1 hour for each)
Each step is designed to help both you and our team understand whether this is a strong mutual fit. We'll guide you through each stage so you always know what's coming.
Let's talk money
- A salary adequate to your experience and skills.
- Flexible remuneration and benefits system via Flexoh, which includes: restaurant card, transportation card, kindergarten, and training tax savings.
True flexibility and work-life balance
- Remote or hybrid work model with our hub in Barcelona.
- Flexible working hours (fully flexible, as in most cases you only have to attend a couple of meetings weekly).
- Summer intensive schedule during July and August (work 7 hours, finish earlier).
- 23 paid holidays, with exchangeable local bank holidays.
- Additional paid holiday on your birthday or work anniversary (you choose what you want to celebrate).
Health comes first
- Private healthcare plan with Adeslas (medical and dental) for you, subsidized for your family.
- Access to hundreds of gyms for a symbolic fee for you and your family, in partnership with Andjoy.
- Access to iFeel, a technological platform for mental wellness offering online psychological support and counseling.