SaaS Cloud Engineer
Job description
GE Vernova's GridOS Platform Engineering team is building the next generation of SaaS reliability for critical energy infrastructure. The SaaS Cloud Engineer sits at the heart of our System Reliability Engineering (SRE) team, owning the end-to-end cloud provisioning lifecycle for every customer environment - from Day 0 bootstrap through Day 2 continuous operations. You will work alongside Platform SRE, Observability, Production DevOps, and SecOps engineers to ensure that GridOS SaaS products meet the highest standards of availability, security, and cost efficiency across our US and international customer base.
Day 0 - Bootstrap
- Automate account bootstrap workflows using Infrastructure as Code (Terraform / AWS CloudFormation) and CI/CD pipelines (GitHub Actions / ArgoCD).
- Implement and maintain Cyber Guardrails aligned to GESOS standards, including jumphost configuration, IAM policies, and VPC networking.
- Deploy standardized cloud infrastructure baselines: AWS CloudTrail, CloudWatch, GuardDuty, Security Hub, and Config Rules.
- Configure DNS, network connectivity, and cross-account trust relationships for each customer environment.
Day 1 - Deploy, Scale & Validate
- Collaborate with Platform SRE to define sizing, scaling, and SLO baselines for each customer workload.
- Support progressive delivery pipelines (blue/green, canary) to ensure zero-downtime deployments.
- Integrate cloud-native observability hooks (CloudWatch, synthetic monitors) for new customer environments.
- Assist with acceptance testing validation gates before production cutover.
Day 2 - Secure, Operate & Optimize
- Drive FinOps practices: right-size resources, implement savings plans, and produce monthly cost reports per customer using AWS Cost Explorer.
- Maintain cloud security posture: apply CVE patches and respond to compliance and audit requirements in coordination with SecOps.
- Participate in on-call rotations for incident response (Level 1/2), root cause analysis (RCA), and BC/DR exercises.
- Continuously improve account automation, reducing toil through scripting (Python, Bash) and runbook codification.
- Monitor FinOps KPIs and flag anomalies proactively to the SRE Lead.
Tech stack
- AWS (EKS, IAM, CloudTrail, CloudWatch, RDS, MSK, SQS, S3, etc.)
- Infrastructure as Code - Terraform
- Kubernetes - EKS, Rancher
- CI/CD - Jenkins, GitHub Actions
- CD - ArgoCD, Flux
- Scripting - Python / Bash
- FinOps / Cost Explorer
- Observability - Grafana/Prometheus, Splunk, Datadog or Dynatrace
- BC/DR Planning
- Incident Response
- Compliance & SecOps
When you first join:
- Shadow existing account provisioning workflows and document gaps in automation.
- Complete onboarding to the SRE on-call rotation as a backup responder.
- Stand up a personal sandbox environment using the team's IaC templates.
In your first 90 days:
- Deliver the first iteration of automated account provisioning, reducing the provisioning SLA to under 8 hours.
- Instrument at least one customer environment with full CloudTrail, CloudWatch, and GuardDuty coverage.
- Present a FinOps baseline report for existing customer accounts.
In your first year:
- Achieve the 4-hour account provisioning SLA target across all new customer onboardings.
- Own the Cyber Guardrails automation library and keep it current against GESOS standards.
- Contribute improvements to at least two SRE runbooks and one DR playbook.
Requirements
- 3-5 years of hands-on experience in cloud infrastructure, SRE, or DevOps engineering roles.
- Deep AWS expertise - EC2, EKS, S3, VPC, IAM, CloudTrail, CloudWatch, GuardDuty, Organizations, Control Tower.
- Proven proficiency with Infrastructure as Code - Terraform or AWS CloudFormation.
- Experience with container orchestration (Kubernetes/EKS) and related tooling (Helm, Rancher).
- Working knowledge of CI/CD pipelines - GitHub Actions (GHA) and/or ArgoCD.
- Scripting fluency in Python and/or Bash for automation and operational tooling.
- Demonstrated experience with cloud security best practices: IAM least privilege, security group design, encryption at rest/in-transit.
- Exposure to FinOps concepts - cost allocation tagging, savings plans, Reserved Instances analysis.
Nice to Have
- Experience with multi-tenant SaaS account vending machines (AWS Control Tower, Landing Zone Accelerator).
- Familiarity with cybersecurity standards and policies in regulated environments.
- Knowledge of GovCloud or regulated-industry compliance (FedRAMP, NERC CIP, SOC 2).
- Exposure to Backstage IDP or similar developer portals.
- AWS certifications: Solutions Architect (Associate or Professional), DevOps Engineer Professional.
Bachelor's Degree in Computer Science or a STEM major (Science, Technology, Engineering, and Math) with basic experience.
- Strong oral and written communication skills.
- Strong business analysis and problem-solving skills.
- Proactively engages with cross-functional teams to resolve issues and design solutions using critical thinking, analytical skills, and best practices.
- Ability to interact at all levels of the organization and with other GE businesses.
Leadership:
- Excellent communicator who works well in a team environment and welcomes challenges.
- Self-starter with the ability to manage multiple priorities in a fast-paced work environment.
- Strong problem-solving and analytical skills, with a demonstrated ability to assimilate new information and understand complex topics.