Sr. Cloud Engineer

VENATOR HOLDINGS, LLC

Rochester, United States of America

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Rochester, United States of America

Tech stack

Amazon Web Services (AWS)

Backup Devices

Cloud Computing

Cloud Computing Security

Cloud Engineering

Configuration Management

Continuous Integration

Disaster Recovery

GitHub

Linux System Administration

Operational Databases

Performance Tuning

Reliability Engineering

Site Reliability Engineering Practices

Prometheus

Software Systems

Systems Architecture

Software Vulnerability Management

Datadog

Data Logging

Pulumi

Cloud Platform System

Autoscaling

Istio

Delivery Pipeline

MTTR

Kubernetes Helm Charts

Kubernetes

Deployment Automation

Bitbucket

CloudWatch

Terraform

New Relic (SaaS)

Software Version Control

Bamboo

Job description

Our client is building a modern, cloud-native platform that powers connected, data-driven manufacturing operations. Their technology sits at the center of increasingly automated factories, integrating equipment, software systems, and real-time production data into a scalable SaaS platform used by global manufacturers.

To support rapid growth and platform scale, they are seeking a Senior Cloud Operations Engineer to own the reliability, performance, and operational excellence of their cloud infrastructure. This high-impact role is responsible for keeping the platform highly available, secure, and scalable as adoption continues to grow.

This position is ideal for engineers who thrive in modern cloud environments, enjoy solving complex reliability challenges, and prefer automating everything possible. The right person will combine deep technical expertise with strong operational discipline, helping build a world-class cloud platform supporting real industrial environments.

Cloud Operations & Reliability

Maintain and optimize production, staging, and development environments running in Kubernetes on AWS
Implement and manage monitoring, logging, alerting, and observability frameworks
Lead incident response efforts and drive post-incident reviews focused on continuous improvement
Own backup, disaster recovery, and business continuity processes
Perform system capacity planning and performance tuning

Automation & Infrastructure Management

Build and maintain Infrastructure-as-Code using tools such as Terraform or Pulumi
Automate provisioning, configuration management, and environment lifecycle processes
Identify and eliminate operational inefficiencies through automation
Manage secrets, environment configuration, and version control across infrastructure environments

Security & Compliance

Implement and maintain least-privilege access models and cloud security guardrails
Support vulnerability management, patching workflows, and dependency maintenance
Assist with compliance readiness efforts including SOC 2, ISO 27001, or similar frameworks
Ensure proper logging, retention, and audit practices across cloud environments

FinOps / Cost Optimization

Monitor and optimize cloud spend across services and environments
Implement tagging standards, budget alerts, and cost visibility frameworks
Recommend architectural improvements to balance performance and cost efficiency

Collaboration & Leadership

Partner closely with engineering teams to improve reliability, deployment pipelines, and system architecture
Mentor engineers on operational best practices and cloud platform management
Develop runbooks, documentation, and operational standards
Champion reliability engineering principles, operational maturity, and risk reduction practices