Site Reliability Engineer (SRE)

Sky Solutions LLC.
7 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote

Tech stack

Amazon Web Services (AWS)
Application Performance Management
Automation of Tests
Cloud Computing
Code Review
DevOps
Monitoring of Systems
Python
PostgreSQL
Load Testing
Performance Tuning
Reliability Engineering
Database Optimization
Backend
Cloudformation
FastAPI
Containerization
Kubernetes
Deployment Automation
Terraform
Docker

Requirements

We are looking for an engineer to improve the reliability, performance, and scalability of our systems. This role combines DevOps, backend performance tuning, and Site Reliability Engineering (SRE), with a strong focus on Python applications, AWS infrastructure, and PostgreSQL databases., 1. Python performance optimization

  1. AWS services
  2. PostgreSQL and database tuning
  3. Familiarity with observability and monitoring tools
  4. DevOps and SRE best practices
  5. Experience with automated testing and code review

Key Responsibilities

Build and maintain cloud infrastructure on AWS

Set up and manage CI/CD pipelines for automated deployments

Monitor application performance using tools like OpenTelemetry

Analyze and optimize Python application performance

Diagnose and tune PostgreSQL database queries and configurations

Define and enforce performance and reliability metrics (SLOs)

Act as a gatekeeper for deployments-block releases if metrics are not met

Investigate production issues and implement long-term fixes

Review and test code using tools like Claude and other methods

Experience with containerization (Docker, Kubernetes)

Knowledge of infrastructure as code (Terraform, CloudFormation)

Experience with load testing and performance benchmarking

Apply for this position