SRE / Platform Engineer

Peec AI

Berlin, Germany

5 months ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

€ 150K

Job location

Berlin, Germany

Tech stack

Artificial Intelligence

Amazon Web Services (AWS)

Azure

Cloud Computing Security

Computer Programming

Databases

Continuous Integration

Distributed Systems

Python

Reliability Engineering

Prometheus

TypeScript

Datadog

Pulumi

Scripting (Bash/Python/Go/Ruby)

Grafana

CloudFormation

Containerization

Kubernetes

Sentry

Terraform

PagerDuty

Job description

Own the reliability, scalability, and performance of Peec AI's core systems and infrastructure
Design, build, and maintain the tooling, automation, and monitoring that keep our services fast, secure, and highly available
Partner closely with product and engineering teams to ensure new features are reliable, observable, and easy to operate from day one
Develop and refine incident response practices, ensuring issues are triaged quickly and resolved with minimal user impact
Proactively identify and address bottlenecks, single points of failure, and operational inefficiencies across the stack
Champion operational excellence and a culture of reliability, driving best practices across the engineering organization

Requirements

5+ years of experience in Site Reliability Engineering, Infrastructure Engineering, or similar roles supporting production systems at scale
Deep expertise with Infrastructure as Code tools (Terraform, Pulumi, CloudFormation, etc.)
Strong experience with observability platforms (e.g., Datadog, Sentry, Prometheus, Grafana) and incident response tooling (PagerDuty, Incident.io, or similar)
Proven proficiency with major cloud platforms (GCP, AWS, or Azure) and modern distributed systems
Strong programming and scripting skills (e.g., TypeScript and Python) for automation and tooling
A track record of diagnosing complex system problems and implementing robust, long-term solutions
Solid understanding of CI/CD, Kubernetes, containerization, networking, databases, and cloud security principles
Excellent problem-solving skills, attention to detail, and a strong commitment to operational excellence

Experience supporting AI/ML workloads or data-intensive systems
Prior SRE experience in a high-growth startup or globally distributed infrastructure environment
Familiarity with zero-downtime migrations, multi-region architectures, or compliance frameworks

Benefits & conditions

Exciting and challenging work with real impact and ownership at one of Europe's fastest-growing Series A startups
Regular team events and off-sites
Aggressive equity compensation package
Paid Uber Eats & Uber home when working late
The most beautiful office space and work environment in Berlin

Compensation Range: €100K - €150K