Site Reliability Engineer with Node.js

Jobs Europe AB

2 months ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Tech stack

Amazon Web Services (AWS)

Application Performance Management

Azure

Bash

Cloud Computing

Databases

Distributed Systems

Python

PostgreSQL

MongoDB

Node.js

Redis

Reliability Engineering

Prometheus

Software Engineering

Datadog

Data Logging

Scripting (Bash/Python/Go/Ruby)

Google Cloud Platform

Grafana

Containerization

Kubernetes

Operating Systems

Docker

Microservices

Requirements

About the role

As an SRE, you will be instrumental in the evolution and hardening of a platform which serves millions of unique daily visitors. Your key responsibilities will include leading initiatives in platform observability and security. You will collaborate closely with development teams to identify and resolve bottlenecks, especially within Node.js-based applications. Your unique blend of software development and infrastructure knowledge will be vital in enhancing the development experience and environments.

Responsibilities

Design, implement, and maintain scalable and highly available systems.

Develop and improve monitoring, alerting, and logging solutions.

Automate operational tasks and processes to reduce manual intervention.

Collaborate with development teams to ensure reliability is built into the software development lifecycle.

Troubleshoot and resolve complex production issues across various services.

Participate in on-call rotations to support our critical systems.

Implement and manage CI/CD pipelines.

Contribute to the continuous improvement of our infrastructure and tooling.

Optimize Node.js application performance and resource utilization.

Advocate for best practices in reliability, performance, and security.

Minimum Qualifications

Strong proficiency in Node.js development and understanding of its ecosystem.

Strong experience with containerization technologies (e.g., Docker, Kubernetes).

Solid understanding of networking concepts, operating systems, and distributed systems.

Experience with monitoring and alerting tools (e.g., Prometheus, Grafana, Datadog).

Experience with cloud platforms (e.g., AWS, GCP, Azure).

Excellent problem-solving and communication skills.

Preferred Qualifications

Proficiency in scripting languages (e.g., Python, Bash).

Knowledge of microservices architecture.

Experience with database technologies (e.g., PostgreSQL, MongoDB, Redis).

Experience with on-call and production incident management.

Understanding of security fundam