Site Reliability Engineer
Job description
Crytek is looking for an experienced Site Reliability Engineer to support Hunt: Showdown's NetOps department in our Frankfurt studio. The person in this position will serve as a liaison between the development teams and the network operations team, working closely with the production team and system architect to ensure that projects are well planned, documented, and implemented. The position combines operational and project duties. This role is based on site at our headquarters in Frankfurt, Germany. You will have the chance to work alongside our world-class team, which features some of the best developers in the industry, and to take advantage of our attractive relocation package.
Responsibilities
- Daily operation and maintenance of a hosted/cloud data-center environment.
- Installation, configuration and patching of system and game software.
- Daily monitoring, management, and reporting tasks, as well as contributing to 99.9% application and system uptime.
- Work closely with R&D to install, configure and operate developed features, as appropriate.
- Ensure appropriate, accurate, up-to-date technical documentation is available for systems, game logic, and the environment.
- Assist with process and policy development and documentation.
- Operate and maintain a dynamic, highly available environment.
Requirements
- 3+ years of experience as a Network Operations Engineer or similar.
- Proficient in container technologies.
- Experience with continuous integration, delivery and automated deployments.
- Proficient in network security.
- Experience working with bare metal as well as cloud servers.
- Proficient in working with automation tools such as Ansible and Terraform.
- Proficient in observability tooling such as OpenTelemetry, Prometheus, Mimir, and Grafana.
- Experience managing all aspects of high-traffic servers, including scaling, profiling, debugging, and stress testing.
- Solid understanding of the full web technology stack (e.g. REST API, HTTP, CDN, cookies, headers, asset loading / caching).
- Solid understanding of and hands-on experience in setting up monitoring for online applications (Go, Java, C++) and creating metrics and alerts in order to act proactively.
- Experience with automation, preferably Ansible and Terraform.
- Skilled in Shell and Python scripting.
- Skilled in network troubleshooting.
- Excellent English communication and writing skills.
- Willing to relocate to Frankfurt.