AI Cloud Engineer
Job description
We're looking for an AI Cloud Engineer in Raleigh, NC to join us in fulfilling our mission while applying our values of excellence, improvement, and connection. In this role, you'll design, build, and operate AI solutions on AWS. You'll play a key role in operationalizing AI at scale, working across the organization with data scientists, data engineers, and product teams to deliver high-impact GenAI solutions.
- GenAI Platform Enablement: Support the development, implementation, scalability, and operations of enterprise-grade GenAI platforms.
- Engineering: Engineer robust, secure, governed, and reusable GenAI capabilities and patterns across the business by leveraging best practices in architecture, automation, and DevOps.
- Collaboration: Contribute to the development and engineering of internal tools, APIs, and infrastructure to streamline adoption of GenAI services across the organization.
- Observability & Quality: Implement end-to-end monitoring, alerting SLA/SLOs, metrics, and cost
- Governance & Risk: Enforce model/version lineage, reproducibility, approvals, rollback plans, auditability, and cost controls aligned to enterprise policies.
- Partner & Mentor: Collaborate with on-shore/off-shore teams; work with key stakeholders on packaging, testing, and performance; contribute to standards and reviews.
- Hands-on Delivery: Prototype new patterns and solutions; troubleshoot production issues across data, model, and infrastructure layers.
Requirements
The ideal candidate has a proven track record of building and operating scalable and secure cloud infrastructure on AWS. Their background includes automating infrastructure using Infrastructure as Code (IaC) and implementing robust CI/CD pipelines. They thrive in cross-functional environments, delivering end-to-end GenAI and cloud solutions.
- Bachelor's degree in computer science, information technology, cloud engineering, or a similar field.
- 5+ years of experience with AWS foundations: ECR/ECS, Lambda, API Gateway, S3, Glue/Athena/EMR, RDS/Aurora (PostgreSQL/MySQL), DynamoDB, CloudWatch, IAM, VPC, WAF, Bedrock.
- 5+ years of experience with CI/CD: CodeBuild/CodePipeline or GitHub Actions/GitLab; blue/green, canary, and shadow deployments for models and services.
- 3+ years of experience with IaC and platforms: Terraform (preferred), parameterized modules, environment promotion, tagging, and cost governance.
- 3+ years of experience with Python, JSON, and Bash; strong containerization skills with Docker.
- Proven ability to perform unit/integration tests for accuracy, KPIs, usage metrics, and adoption.
- Demonstrated operational mindset with experience in incident response for model services, SLOs, dashboards, runbooks; strong debugging across data, model, and infra layers.
- Clear communication, collaborative mindset, and a bias to automate & document.
- Experience with resilient application design and secure coding practices.
- Experience using test-driven development (TDD) or behavior-driven development (BDD).
Preferred qualifications
- Experience with enterprise-grade GenAI platforms such as AWS Bedrock (Anthropic Claude), MS Azure AI Services (OpenAI/ChatGPT)
- Experience adhering to and implementing data security and governance best practices in a highly regulated environment (encryption, RBAC, auditing, and regulatory standards such as HIPAA and SOC 2).
Benefits & conditions
- We seek out and incorporate diverse views to strengthen our outcomes
- We work on challenging and rewarding projects
- We offer competitive benefits:
  - Hybrid work schedule (in-office days Tues/Wed/Thurs)
  - Generous Time Off
  - 40 Hours of Volunteer Time Off
  - Tuition Reimbursement and Student Loan Repayment
  - Paid Family Leave and Flexible Spending Accounts
  - 401k with up to 5% employer match
  - Fitness and Emotional Wellness Reimbursements
  - Onsite Gym