Infrastructure Engineer

Alldus International Consulting Ltd
San Francisco, United States of America
1 month ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$250K

Job location

San Francisco, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
GitHub
Identity and Access Management
Reliability Engineering
TypeScript
Pulumi
Technical Debt
Backend
Kubernetes
Functional Programming
CloudWatch
Terraform
Elixir

Job description

Our client, an exciting AI-driven startup, is hiring an Infrastructure Engineer to join the team in San Francisco. The successful candidate will be responsible for designing, building and maintaining the infrastructure and tooling that drive the next generation of AI products, including shaping how the company builds, operates and monitors its systems in production.

  • Design, implement and maintain AWS infrastructure across compute, networking, storage, IAM, monitoring, logging and security.
  • Manage infrastructure-as-code tooling to codify, version and deploy systems reliably.
  • Work closely with engineers and stakeholders to map dependencies, build deployment pipelines and ensure seamless rollouts.
  • Balance priorities across feature delivery, reliability, technical debt and infrastructure evolution by making informed decisions, sequencing work effectively and communicating trade-offs clearly.
  • Ensure production system reliability by defining SLAs, setting up monitoring and alerting, managing incident response and contributing to post-mortems for continuous improvement.
  • Identify and execute improvements proactively, enhancing infrastructure performance, scalability and operational efficiency.

Requirements

  • Extensive AWS expertise including EC2, ECS, Lambda, VPC, S3, RDS, IAM, CloudWatch and related services.
  • Proven experience with infrastructure-as-code tools such as Terraform or Pulumi in production environments.
  • Hands-on experience building CI/CD pipelines, for example using GitHub Actions.
  • Strong understanding of reliability engineering, including monitoring, alerting, incident response, capacity planning, chaos testing and load management.
  • Excellent communication skills, able to clearly explain infrastructure, trade-offs, reliability metrics and deployment processes to both technical and non-technical stakeholders.
  • Previous startup experience is a plus.
  • Familiarity with Elixir and TypeScript is desirable.
  • Knowledge of security compliance frameworks such as SOC 2 and ISO 27001.
  • Experience with Kubernetes/EKS is a bonus.

Benefits & conditions

  • Salary: $185k - $250k DOE.
  • Health insurance for you and your family.
  • 401k plan.

57009
