Cloud Systems Engineer

CubeSmart
Malvern, United States of America
18 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Malvern, United States of America

Tech stack

Amazon Web Services (AWS)
Bash
Ubuntu (Operating System)
Cloud Computing Security
Cloud Engineering
Computer Programming
Computer Networks
Continuous Integration
Software Design Patterns
Linux
DevOps
Distributed Systems
GitHub
Monitoring of Systems
Identity and Access Management
Python
Key Management
PostgreSQL
Octopus Deploy
OpenID
Redis
Reliability Engineering
Ansible
Prometheus
Zero Trust Network Access
Datadog
Scripting (Bash/Python/Go/Ruby)
Load Balancing
Cloud Platform System
Grafana
Caching
Reliability of Systems
Containerization
GitLab CI
Kubernetes
HashiCorp
CloudWatch
Terraform
Docker
PagerDuty
Jenkins
Static Application Security Testing
Microservices
Dynamic Application Security Testing

Job description

  • Ensure uptime, reliability, and performance of AWS-hosted, Linux-based (Ubuntu) production systems and associated lower environments
  • Build and optimize observability using tools like Datadog, CloudWatch, Prometheus/Grafana, and PagerDuty
  • Work closely with development teams to diagnose site issues, mitigate impact, and restore system reliability while communicating clearly with stakeholders
  • Lead incident response, root cause analysis, and post-incident reviews
  • Participate in on-call rotations and support 24/7 production environments

Cloud Architecture & Automation

  • Architect and implement fully automated, ephemeral, and immutable AWS production and lower environments
  • Design scalable, resilient distributed systems using AWS best practices
  • Eliminate manual processes through Infrastructure as Code (Terraform, Ansible, Packer)
  • Build and maintain CI/CD and GitOps workflows (Jenkins, GitHub Actions, GitLab CI, ArgoCD/Flux)
  • Develop automation and tooling using Python and Bash to reduce operational toil

Infrastructure & Platform Engineering

  • Deploy and manage AWS services including EKS, ECS, Fargate, Lambda, RDS (Aurora PostgreSQL), OpenSearch, Redis, and ElastiCache
  • Design and manage networking components such as Transit Gateways, load balancers, and service meshes
  • Implement caching, microservices, and distributed system design patterns

Security & Governance

  • Architect and implement zero-trust security models using IAM, SCPs, and OIDC
  • Embed security into CI/CD pipelines using SAST/DAST tools (e.g., Snyk)
  • Ensure compliance through automated auditing, backup strategies, and governance controls

Collaboration, Leadership & Strategy

  • Partner with development, security, and operations teams to build reliable, observable platforms
  • Document systems, runbooks, and operational procedures
  • Drive FinOps initiatives for cost optimization and forecasting
  • Integrate infrastructure changes into ITIL-compliant workflows (e.g., Freshservice)
  • Influence architectural decisions and promote engineering best practices across teams

Requirements

We are seeking a highly skilled Site Reliability & Cloud Systems Engineer to design, build, and operate scalable, secure, and highly automated cloud platforms in AWS. This role combines hands-on reliability engineering with cloud architecture and automation expertise, with a strong emphasis on building immutable infrastructure and improving system resilience.

  • 6-10+ years of experience in Site Reliability Engineering, DevOps, or Cloud Engineering roles
  • Deep hands-on expertise with AWS services and cloud architecture
  • Strong Linux systems engineering experience (Ubuntu preferred)
  • Proven experience with Infrastructure as Code (Terraform, Ansible, etc.)
  • Experience building and maintaining CI/CD pipelines
  • Proficiency in scripting/programming (Python, Bash)
  • Hands-on experience with monitoring and observability platforms
  • Solid understanding of cloud security principles (IAM, KMS, Secrets Management, Ansible Vault, HashiCorp Vault)
  • Bachelor's degree or equivalent practical experience
  • Experience with containerization and orchestration (Docker, Kubernetes, EKS/ECS)
  • Familiarity with GitOps tools such as ArgoCD or Flux
  • Experience with SAST/DAST tools and secure SDLC practices
  • Knowledge of distributed systems, caching, and microservices architectures
  • Experience with FinOps and cost optimization strategies
  • Exposure to ITIL processes and service management platforms

About the company

At CubeSmart, we're intentional about culture. You can experience it everywhere, from our mission statement of "genuine care" to our "It's What's Inside That Counts" tagline to calling each other "teammates" rather than employees. This spirit fosters a fun and collaborative environment that has resulted in our rapid growth and in our being recognized among the top companies in our industry. CubeSmart's award-winning team is made up of people who genuinely care. Teammates care about our customers and the life events and business needs they are facing. Teammates are passionate, responsible, and understanding. The CubeSmart team is made up of people who have a can-do attitude, are committed to their own success and the success of the company, and lead by example. If this sounds like a team and culture that matches your personal values and motivations, we want to hear from you.
