Software Engineer (Python), Cloud Infrastructure & Platform Development (AWS)
Job description
Infrastructure & Reliability
- Design and implement multi-region AWS disaster recovery solutions, including fallback infrastructure for us-east-1 outages
- Architect and maintain highly available, scalable cloud infrastructure across multiple AWS regions
- Ensure infrastructure resilience through chaos engineering and disaster recovery testing
Development
- Develop and deploy new features using Python and the Open Source Serverless Framework
- Build and maintain serverless applications (Lambda, API Gateway, DynamoDB, etc.)
- Write clean, maintainable, and well-tested code following best practices
- Contribute to architectural decisions and technical design reviews
Platform Observability
- Design and implement comprehensive observability solutions for production platforms
- Set up monitoring, logging, and alerting using tools such as CloudWatch, Datadog, Grafana, or similar
- Establish SLIs, SLOs, and error budgets to measure platform health
- Create dashboards and on-call runbooks for incident response
CI/CD & Automation
- Design, implement, and maintain CI/CD pipelines for automated testing and deployment
- Automate infrastructure provisioning using Infrastructure as Code (Terraform, CloudFormation, CDK)
- Implement security scanning, testing, and compliance checks in deployment pipelines
- Optimize build and deployment processes for speed and reliability
Team Leadership
- Mentor and manage development teams, fostering a culture of technical excellence
- Conduct code reviews and provide constructive feedback
- Facilitate technical discussions and help unblock team members
- Collaborate with product and engineering teams to deliver on roadmap priorities
Requirements
- 5+ years of software engineering experience with strong Python development skills
- 3+ years of hands-on experience with AWS services (EC2, Lambda, S3, RDS, VPC, IAM, CloudFormation, etc.)
- Proven experience building and deploying serverless applications (AWS Lambda, API Gateway, Step Functions)
- Strong understanding of multi-region architecture and disaster recovery patterns
- Experience designing and implementing CI/CD pipelines (GitHub Actions, GitLab CI, Jenkins, or similar)
- Demonstrated experience setting up observability and monitoring solutions
- Experience managing or mentoring development teams
- Strong understanding of networking, security, and cloud best practices
- Excellent problem-solving skills and ability to debug complex distributed systems
Preferred Qualifications
- Experience with the Serverless Framework (serverless.com)
- AWS certifications (Solutions Architect, DevOps Engineer, or similar)
- Experience with Infrastructure as Code tools (Terraform, AWS CDK, CloudFormation)
- Knowledge of containerization and orchestration (Docker, ECS, Kubernetes)
- Experience with observability platforms (Datadog, New Relic, Grafana, Prometheus)
- Familiarity with event-driven architectures and message queuing systems (SQS, SNS, EventBridge)
- Experience with testing frameworks and test automation
- Background in Agile/Scrum methodologies
- Strong communication skills and experience working with cross-functional teams