Cloud Infrastructure & Agentic Architect
Job description
We are looking for a Cloud Infrastructure & Agentic Architect (f/m/d) to own the technical foundation of our cloud platform. You will maintain our service catalog, design architectural blueprints, establish the observability standard, and - most distinctively - bring hands-on LLM-driven tooling skills to a team actively shaping what the next generation of cloud operations looks like.
- Build and maintain a minimalistic, opinionated service catalog of approved cloud components across Google Cloud and Azure
- Apply a serverless-first, PaaS-first philosophy - challenge complexity and push back on unnecessary infrastructure sprawl
- For every approved catalog entry, deliver production-ready configuration: IaC, security baseline, observability hooks, and runbook
- Scrutinise every new component request and justify its addition in terms of cost, operational overhead, and platform alignment
Cloud Architecture
- Define, document, and govern cloud architectural blueprints across networking, compute, storage and data layers
- Design event-driven pipelines that trigger AI-assisted validation, drift detection, and deployment gates
- Serve as the technical authority on platform design decisions within your domain
LLM-Driven Operations
- Implement LLM-based automated deployment capabilities using tools such as OpenCode, Claude Code, or equivalent frameworks
- Design and operate infrastructure workflows augmented by AI agents - from deployment validation to configuration drift detection
- Stay ahead of the market in AI-assisted tooling and bring relevant innovations into the platform
Observability
- Define the observability standard: structured logging, distributed tracing, alerting, dashboards, and SLO/SLA frameworks
- Establish platform-level KPIs and ensure consistent adoption across engineering teams
Collaboration
- Partner with DevOps and Engineering teams to embed platform standards into delivery pipelines
- Partner with SecOps to integrate all controls and security requirements
- Document everything: blueprints, ADRs, runbooks, onboarding guides
Month 3: Service catalog baseline documented; at least 10 approved components with full IaC, security baseline, and runbooks in place
Month 6: Observability standard adopted by 8+ engineering teams; LLM-assisted deployment workflows running in production
Month 12: Platform architecture ADRs covering all major components; FinOps baseline established; catalog governance process self-sustaining
You're probably NOT a fit if
- You need a large team to delegate to before making decisions
- You prefer deep specialisation in a single cloud over breadth across two
- You're not actively following the LLM/AI tooling space
- You're looking for a role with clearly defined scope and minimal ambiguity
What we offer
- Multicultural team across 7 countries (Germany, France, UK, UAE, Spain, New Zealand, Australia)
- Hybrid-first working
- Continuous learning & certification budget (GCP / Azure certifications fully sponsored)
- Open, inclusive, and high-ownership culture
Requirements
You have operated multi-cloud Kubernetes environments in production, built IaC pipelines with Terraform, and have hands-on experience with LLM-driven tooling in real engineering workflows. Specifically:
- Proven cloud networking experience: firewalls, gateways, VPCs, VPNs, private service connect, and network security groups
- Strong, current proficiency across Google Cloud Platform (GKE, Cloud Run, Cloud Functions, VPC, Cloud Armor, Pub/Sub, Artifact Registry) and Microsoft Azure
- Proven IaC experience with Terraform (required); Ansible a strong plus
- Proven hands-on LLM tooling experience: must have used OpenCode, Claude Code, or equivalent AI-assisted coding/deployment agents in real engineering workflows
- Kubernetes at scale: demonstrated experience operating AKS (Azure Kubernetes Service) and GKE (Google Kubernetes Engine) - cluster lifecycle, upgrades, networking, and workload management
- Experience with Vertex AI, Azure AI Foundry, or similar cloud model gardens
- Comfortable working in ambiguity, driving clarity from first principles without heavy process support
- Experience in Agile/fast-iteration environments with high individual ownership
Nice-to-Have (Bonus Skills)
- Knowledge of C5, SecNumCloud, ISO 27001, or equivalent cloud security frameworks
- FinOps - cost attribution, rightsizing, commitment strategies
- Prior experience in a SaaS product company at scale
- Platform engineering / SRE background
Soft Skills
- Intense technical curiosity - you follow the market, test new tools, and form opinions before they become mainstream
- Strong problem-solving discipline: you reach for first principles, not familiarity
- Ownership mindset - you build it, you run it, you improve it
- Clear written and verbal communication: able to document decisions and explain them to non-specialists
- Operates well under pressure and tight constraints - energised by hard problems
- Strong negotiation skills - fact-driven and data-driven: you argue with evidence, not opinion
- Highly collaborative: you build alignment across engineering, SecOps, and leadership
Benefits & conditions
By combining information management expertise and in-depth knowledge of the building, infrastructure, and energy industries, Thinkproject empowers customers to efficiently deliver, operate, regenerate, and dispose of their built assets across their entire lifecycle through a Connected Data Ecosystem.
About the company
thinkproject was founded in 2000 in Munich, Germany. Since then, the company has grown into the leading provider for cross-enterprise collaboration and information management in Europe.
Global customers from the construction and engineering industries are served from thinkproject’s home base in Munich and via a range of subsidiaries across Europe.
thinkproject addresses today’s digitization challenges in construction and engineering by providing state-of-the-art software solutions as well as industry expert consulting and services.