Tech Lead SRE / DevOps: IT Infrastructure Transformation
Digital Performance GmbH
Hamburg, Germany
2 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Shift work
Languages: English, German
Experience level: Senior
Job location: Remote; Hamburg, Germany
Tech stack
Amazon Web Services (AWS)
Azure
Bash
Cloud Computing
Configuration Management
Computer Networks
Data Centers
Linux
GitHub
Python
Linux Servers
Octopus Deploy
Reliability Engineering
Ansible
Prometheus
Runbook
Software Engineering
Google Cloud Platform
Grafana
AWS Lambda
Containerization
GitLab CI
Kubernetes
Infrastructure Automation Frameworks
Information Technology
Performance Monitor
HashiCorp
Terraform
Webhooks
Docker
Jenkins
Job description
We are an innovative technology company specializing in the development and marketing of scalable and high-performance community platforms in the online dating sector, an industry that remains stable and continues to grow even in times of crisis.
Join our team and leverage your expertise in Site Reliability Engineering (SRE) and IT transformation to further develop our platforms and deliver innovative solutions for our customers and internal teams. You can expect exciting challenges, the latest technologies, and a strong team that values collaboration, automation, and continuous improvement.
What you will do in our team
- Automation of Operational Processes
  - Development and Implementation of Infrastructure-as-Code (IaC) Solutions (e.g., Terraform, Ansible).
  - Automation of Recurring Tasks such as deployments, scaling, and troubleshooting.
  - Building and Maintaining CI/CD Pipelines for fast and secure software delivery.
- Transformation of Traditional Linux Systems into Kubernetes
  - Migration of Traditional Linux Servers to Kubernetes environments.
  - Supporting Software Development Teams in utilizing Kubernetes for their workloads.
  - Automation of Network Components in the on-premise data center.
- Secure Secret Management
  - Development and Implementation of a Secret Management System (e.g., HashiCorp Vault).
  - Integration of Critical IT Infrastructure Components.
  - Providing Simple but Secure Access to credentials for applications and users.
- Error Management & Self-Healing Systems
  - Implementation of Automated Recovery Mechanisms to minimize system downtime.
  - Development of Proactive Monitoring Solutions for early problem detection.
  - Utilization of Event-Driven Automation (e.g., Lambda functions, webhooks) for autonomous anomaly response.
- Infrastructure Optimization
  - Automated Scaling of Cloud and On-Premise Systems based on load predictions.
  - Ensuring Compliance with Security Policies through Compliance-as-Code.
  - Resource Optimization via Automated Capacity Planning.
- Incident Response & Troubleshooting
  - Development of Runbooks and Playbooks for automated incident management.
  - Utilization of AIOps for log and metric analysis to enable early fault detection.
  - Collaboration with Developers and Operations Teams to accelerate issue resolution through automation.
- Leadership
  - You'll contribute to the leadership of the team, especially when it comes to new technologies and agile challenges.
Requirements
- At least 5 years of professional experience in Site Reliability Engineering, DevOps, or Software Development with a focus on Automation & Configuration Management.
- Ideally, you have experience as a Tech Lead or Team Lead.
- Language skills: Very good proficiency in German (fluent in spoken and written form) and/or English (fluent in spoken and written form).
- This role requires on-site presence in our Hamburg office two days a week for team meetings. We support flexible work models while ensuring effective collaboration within the team.
- Strong expertise in Infrastructure as Code (IaC) and Configuration Management (e.g., Terraform, Ansible).
- Proficiency in CI/CD pipelines and automation tools (e.g., Jenkins, GitHub Actions, GitLab CI, ArgoCD).
- Experience with monitoring and logging solutions (e.g., Prometheus, Grafana).
- Knowledge of container technologies & orchestration (e.g., Kubernetes, Docker).
- Experience with cloud platforms (AWS, Azure, Google Cloud).
- Good programming and scripting skills (Python, Bash).
- Problem-solving mindset and strong teamwork skills.
Benefits & conditions
- Permanent employment contract
- Flexible working hours and mobile working
- Individual training and development opportunities
- Subsidy for an Urban Sports Club membership (M or L)
- Three weekly fitness sessions with a professional trainer
- Regular team events to foster a strong team spirit
- Use our JobRad leasing model and ride your dream bike for both work and personal use
- Option to work remotely from abroad for a limited time in coordination with your team
- Subsidy for the 63-euro ticket ("Deutschlandticket")