SRE / Platform Engineer

Peec AI
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
€ 100K - € 150K

Job location

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Cloud Computing Security
Computer Programming
Databases
Continuous Integration
Distributed Systems
Python
Reliability Engineering
Prometheus
TypeScript
Datadog
Pulumi
Scripting (Bash/Python/Go/Ruby)
Grafana
CloudFormation
Containerization
Kubernetes
Sentry
Terraform
PagerDuty

Job description

  • Own the reliability, scalability, and performance of Peec AI's core systems and infrastructure
  • Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available
  • Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one
  • Develop and refine incident response practices, ensuring issues are triaged quickly and resolved with minimal user impact
  • Proactively identify and address bottlenecks, single points of failure, and operational inefficiencies across the stack
  • Champion operational excellence and a culture of reliability, driving best practices across the engineering organization

Requirements

  • 5+ years of experience in Site Reliability Engineering, Infrastructure Engineering, or similar roles supporting production systems at scale
  • Deep expertise with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation, etc.)
  • Strong experience with observability platforms (e.g., Datadog, Sentry, Prometheus, Grafana) and incident response tooling (PagerDuty, Incident.io, or similar)
  • Proven proficiency with major cloud platforms (GCP, AWS, or Azure) and modern distributed systems
  • Strong programming and scripting skills (e.g., TypeScript and Python) for automation and tooling
  • A track record of diagnosing complex system problems and implementing robust, long-term solutions
  • Solid understanding of CI/CD, Kubernetes, containerization, networking, databases, and cloud security principles
  • Excellent problem-solving skills, attention to detail, and a strong commitment to operational excellence
  • Experience supporting AI/ML workloads or data-intensive systems
  • Prior SRE experience in a high-growth startup or globally distributed infrastructure environment
  • Familiarity with zero-downtime migrations, multi-region architectures, or compliance frameworks

Benefits & conditions

  • Exciting and challenging work with real impact and ownership at one of Europe's fastest-growing Series A startups
  • Regular team events and off-sites
  • Aggressive equity compensation package
  • Paid Uber Eats & Uber home when working late
  • The most beautiful office space and work environment in Berlin

Compensation Range: €100K - €150K
