Celery on AWS ECS - the art of background tasks & continuous deployment
Stop losing background tasks during deployments. Learn the battle-tested Celery configurations and task patterns for a resilient, zero-downtime system on AWS ECS.
#1 · about 2 minutes
Understanding the role of background tasks in applications
Background tasks are essential for handling long-running processes, scheduled jobs like newsletters, and operations that require retries without blocking the user.
#2 · about 3 minutes
Choosing Celery and AWS ECS with Fargate for your stack
Celery is the most widely used Python task queue, and AWS ECS with Fargate provides a serverless, scalable environment for running workers without managing servers.
#3 · about 2 minutes
Key AWS ECS settings for reliable Celery workers
Configure a long stop timeout (120 seconds) and set minimum healthy percent to 50% to give workers time to shut down gracefully during deployments.
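As a sketch, those two settings live in the ECS task definition and the service's deployment configuration (the container name is a placeholder; `stopTimeout` and `minimumHealthyPercent` are the real ECS field names):

```json
{
  "containerDefinitions": [
    {
      "name": "celery-worker",
      "stopTimeout": 120
    }
  ]
}
```

```json
{
  "deploymentConfiguration": {
    "minimumHealthyPercent": 50
  }
}
```

With `stopTimeout` at 120 seconds (the Fargate maximum), ECS waits two minutes between SIGTERM and SIGKILL, and a 50% minimum healthy percent lets a deployment replace workers gradually instead of all at once.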
#4 · about 3 minutes
Handling interruptions from continuous deployment and scaling
Frequent deployments and auto-scaling actions on ECS interrupt running tasks, which can prevent long-running jobs from ever completing and risk task loss.
#5 · about 10 minutes
Configuring Celery for task reliability and visibility
Set `task_acks_late`, `task_reject_on_worker_lost`, a short `visibility_timeout`, and a `worker_prefetch_multiplier` of one to prevent task loss and duplication.
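A sketch of those settings as a plain config mapping. The setting names are real Celery settings; the Redis broker and the 60-second timeout value are assumptions for illustration:

```python
# Reliability settings for a Celery worker, as a config mapping
# (values are illustrative; tune them for your workload).
CELERY_RELIABILITY_CONFIG = {
    # Acknowledge only after the task finishes, not on receipt,
    # so a killed worker leaves the message on the queue.
    "task_acks_late": True,
    # Re-queue (rather than ack) tasks whose worker process died mid-run.
    "task_reject_on_worker_lost": True,
    # Fetch one task at a time so a dying worker holds as little
    # unstarted work as possible.
    "worker_prefetch_multiplier": 1,
    # With a Redis broker, unacked tasks become visible to other
    # workers again after this many seconds.
    "broker_transport_options": {"visibility_timeout": 60},
}

# With a real app this would be applied as:
#   app = Celery("worker", broker="redis://localhost:6379/0")
#   app.conf.update(CELERY_RELIABILITY_CONFIG)
```

The short visibility timeout is what lets another worker pick up a task whose original worker was killed before acknowledging it.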
#6 · about 3 minutes
Remapping SIGTERM to SIGQUIT for immediate cold shutdowns
Use the `BILLIARD_REMAP_SIGTERM` environment variable to remap SIGTERM to SIGQUIT, triggering an immediate cold shutdown that re-queues interrupted tasks.
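As a sketch, the remap is just an environment variable on the worker container (the `echo` is only there to show the value took effect); in ECS it would go in the container definition's `environment` list:

```shell
# Remap SIGTERM (which ECS sends on task stop) to SIGQUIT so that
# billiard performs a cold shutdown instead of a warm one.
export BILLIARD_REMAP_SIGTERM=SIGQUIT
echo "$BILLIARD_REMAP_SIGTERM"
```

The Celery worker process started in this environment inherits the variable; combined with `task_acks_late`, a cold shutdown means in-flight tasks go back to the queue immediately instead of blocking the deployment.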
#7 · about 3 minutes
Designing tasks to be short-lived and idempotent
Design tasks to be idempotent and aim for a maximum processing time under 15 minutes to reduce the impact of interruptions and ensure reliable execution.
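A minimal sketch of the idempotency idea, using a hypothetical `send_welcome_email` task; a real worker would record processed IDs in a database or Redis rather than an in-process set:

```python
# Illustrative only: a durable store (DB/Redis) would replace this set.
_processed = set()

def send_welcome_email(user_id):
    """Return True if the email was sent, False on a no-op replay."""
    if user_id in _processed:
        # Re-delivery after an interruption: nothing happens twice.
        return False
    # ... actually send the email here ...
    _processed.add(user_id)
    return True
```

Because a re-queued copy of the task is a harmless no-op, the at-least-once delivery produced by `task_acks_late` and cold shutdowns stays safe.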
#8 · about 7 minutes
Using fan-out and batching patterns to manage long workloads
Break down large jobs using the fan-out pattern for parallel processing or the batching pattern for sequential, interruptible processing of smaller chunks.
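A sketch of the two patterns with plain functions (names are hypothetical). With Celery, `dispatch_chunk` would be a task and the fan-out parent would call `dispatch_chunk.delay(chunk)` per chunk instead of a plain function:

```python
def make_batches(items, batch_size):
    """Batching: split a large workload into small, interruptible chunks
    that can be processed (and re-queued) one at a time."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def fan_out(items, batch_size, dispatch):
    """Fan-out: hand each chunk to its own task for parallel processing.
    `dispatch` stands in for enqueueing a child task; returns the
    number of chunks dispatched."""
    count = 0
    for chunk in make_batches(items, batch_size):
        dispatch(chunk)
        count += 1
    return count
```

Either way, no single task runs long enough for a deployment-time interruption to lose significant work.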
#9 · about 3 minutes
Using Redis for task locking to prevent duplicate execution
Implement a locking mechanism using Redis to ensure that only one worker can process a specific task at a time, preventing race conditions and duplicate work.
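A sketch of the acquire side of such a lock. `FakeRedis` is an in-memory stand-in so the example is self-contained; real code would use `redis.Redis`, whose `set()` accepts the same `nx=` and `ex=` keyword arguments. `acquire_lock` and the key naming are illustrative:

```python
import time
import uuid

class FakeRedis:
    """Tiny in-memory stand-in for redis-py, for this sketch only."""
    def __init__(self):
        self._store = {}

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._store.get(key)
        if nx and current is not None and current[1] > now:
            return None  # key exists and has not expired: SET NX fails
        expires = now + ex if ex is not None else float("inf")
        self._store[key] = (value, expires)
        return True

def acquire_lock(client, name, ttl=60):
    """Try to take the lock; return a token on success, None if held.
    SET NX EX is atomic, so two workers cannot both succeed, and the
    TTL guarantees the lock expires if its holder is killed."""
    token = uuid.uuid4().hex
    if client.set(f"lock:{name}", token, nx=True, ex=ttl):
        return token
    return None
```

Releasing safely (deleting the key only if the stored token still matches, typically via a small Lua script) is omitted here for brevity.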
#10 · about 4 minutes
Reviewing code examples for fan-out, batching, and locking
A walkthrough of Python code demonstrates how to implement the fan-out, batching, and Redis-based locking patterns for robust Celery tasks.
#11 · about 16 minutes
Answering common questions about Celery on AWS
Discussion on topics including the generality of interruption problems, collecting logs from killed workers, and finding additional learning resources for Celery and AWS.