Celery on AWS ECS - the art of background tasks & continuous deployment
Stop losing background tasks during deployments. Learn the battle-tested Celery configurations and task patterns for a resilient, zero-downtime system on AWS ECS.
#1 · about 2 minutes
Understanding the role of background tasks in applications
Background tasks are essential for handling long-running processes, scheduled jobs like newsletters, and operations that require retries without blocking the user.
#2 · about 3 minutes
Choosing Celery and AWS ECS with Fargate for your stack
Celery is the most widely used Python task queue, and AWS ECS with Fargate provides a serverless, scalable environment for running workers without managing servers.
#3 · about 2 minutes
Key AWS ECS settings for reliable Celery workers
Configure a long stop timeout (120 seconds) and set minimum healthy percent to 50% to give workers time to shut down gracefully during deployments.
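As a sketch, those two settings live in the ECS task definition and the service's deployment configuration (the container name is a placeholder; `stopTimeout` and `minimumHealthyPercent` are the real ECS field names):

```json
{
  "containerDefinitions": [
    {
      "name": "celery-worker",
      "stopTimeout": 120
    }
  ]
}
```

```json
{
  "deploymentConfiguration": {
    "minimumHealthyPercent": 50
  }
}
```

With `stopTimeout` at 120 seconds (the Fargate maximum), ECS waits two minutes between SIGTERM and SIGKILL, and a 50% minimum healthy percent lets a deployment replace workers gradually instead of all at once.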
#4 · about 3 minutes
Handling interruptions from continuous deployment and scaling
Frequent deployments and auto-scaling actions on ECS interrupt running tasks, which can prevent long-running jobs from ever completing and risk task loss.
#5 · about 10 minutes
Configuring Celery for task reliability and visibility
Set `task_acks_late`, `task_reject_on_worker_lost`, a short `visibility_timeout`, and a `worker_prefetch_multiplier` of one to prevent task loss and duplication.
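A sketch of those settings as a plain config mapping. The setting names are real Celery settings; the Redis broker and the 60-second timeout value are assumptions for illustration:

```python
# Reliability settings for a Celery worker, as a config mapping
# (values are illustrative; tune them for your workload).
CELERY_RELIABILITY_CONFIG = {
    # Acknowledge only after the task finishes, not on receipt,
    # so a killed worker leaves the message on the queue.
    "task_acks_late": True,
    # Re-queue (rather than ack) tasks whose worker process died mid-run.
    "task_reject_on_worker_lost": True,
    # Fetch one task at a time so a dying worker holds as little
    # unstarted work as possible.
    "worker_prefetch_multiplier": 1,
    # With a Redis broker, unacked tasks become visible to other
    # workers again after this many seconds.
    "broker_transport_options": {"visibility_timeout": 60},
}

# With a real app this would be applied as:
#   app = Celery("worker", broker="redis://localhost:6379/0")
#   app.conf.update(CELERY_RELIABILITY_CONFIG)
```

The short visibility timeout is what lets another worker pick up a task whose original worker was killed before acknowledging it.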
#6 · about 3 minutes
Remapping SIGTERM to SIGQUIT for immediate cold shutdowns
Use the `BILLIARD_REMAP_SIGTERM` environment variable to remap SIGTERM to SIGQUIT, triggering an immediate cold shutdown that re-queues interrupted tasks.
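As a sketch, the remap is just an environment variable on the worker container (the `echo` is only there to show the value took effect); in ECS it would go in the container definition's `environment` list:

```shell
# Remap SIGTERM (which ECS sends on task stop) to SIGQUIT so that
# billiard performs a cold shutdown instead of a warm one.
export BILLIARD_REMAP_SIGTERM=SIGQUIT
echo "$BILLIARD_REMAP_SIGTERM"
```

The Celery worker process started in this environment inherits the variable; combined with `task_acks_late`, a cold shutdown means in-flight tasks go back to the queue immediately instead of blocking the deployment.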
#7 · about 3 minutes
Designing tasks to be short-lived and idempotent
Design tasks to be idempotent and aim for a maximum processing time under 15 minutes to reduce the impact of interruptions and ensure reliable execution.
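A minimal sketch of the idempotency idea, using a hypothetical `send_welcome_email` task; a real worker would record processed IDs in a database or Redis rather than an in-process set:

```python
# Illustrative only: a durable store (DB/Redis) would replace this set.
_processed = set()

def send_welcome_email(user_id):
    """Return True if the email was sent, False on a no-op replay."""
    if user_id in _processed:
        # Re-delivery after an interruption: nothing happens twice.
        return False
    # ... actually send the email here ...
    _processed.add(user_id)
    return True
```

Because a re-queued copy of the task is a harmless no-op, the at-least-once delivery produced by `task_acks_late` and cold shutdowns stays safe.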
#8 · about 7 minutes
Using fan-out and batching patterns to manage long workloads
Break down large jobs using the fan-out pattern for parallel processing or the batching pattern for sequential, interruptible processing of smaller chunks.
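A sketch of the two patterns with plain functions (names are hypothetical). With Celery, `dispatch_chunk` would be a task and the fan-out parent would call `dispatch_chunk.delay(chunk)` per chunk instead of a plain function:

```python
def make_batches(items, batch_size):
    """Batching: split a large workload into small, interruptible chunks
    that can be processed (and re-queued) one at a time."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def fan_out(items, batch_size, dispatch):
    """Fan-out: hand each chunk to its own task for parallel processing.
    `dispatch` stands in for enqueueing a child task; returns the
    number of chunks dispatched."""
    count = 0
    for chunk in make_batches(items, batch_size):
        dispatch(chunk)
        count += 1
    return count
```

Either way, no single task runs long enough for a deployment-time interruption to lose significant work.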
#9 · about 3 minutes
Using Redis for task locking to prevent duplicate execution
Implement a locking mechanism using Redis to ensure that only one worker can process a specific task at a time, preventing race conditions and duplicate work.
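A sketch of the acquire side of such a lock. `FakeRedis` is an in-memory stand-in so the example is self-contained; real code would use `redis.Redis`, whose `set()` accepts the same `nx=` and `ex=` keyword arguments. `acquire_lock` and the key naming are illustrative:

```python
import time
import uuid

class FakeRedis:
    """Tiny in-memory stand-in for redis-py, for this sketch only."""
    def __init__(self):
        self._store = {}

    def set(self, key, value, nx=False, ex=None):
        now = time.monotonic()
        current = self._store.get(key)
        if nx and current is not None and current[1] > now:
            return None  # key exists and has not expired: SET NX fails
        expires = now + ex if ex is not None else float("inf")
        self._store[key] = (value, expires)
        return True

def acquire_lock(client, name, ttl=60):
    """Try to take the lock; return a token on success, None if held.
    SET NX EX is atomic, so two workers cannot both succeed, and the
    TTL guarantees the lock expires if its holder is killed."""
    token = uuid.uuid4().hex
    if client.set(f"lock:{name}", token, nx=True, ex=ttl):
        return token
    return None
```

Releasing safely (deleting the key only if the stored token still matches, typically via a small Lua script) is omitted here for brevity.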
#10 · about 4 minutes
Reviewing code examples for fan-out, batching, and locking
A walkthrough of Python code demonstrates how to implement the fan-out, batching, and Redis-based locking patterns for robust Celery tasks.
#11 · about 16 minutes
Answering common questions about Celery on AWS
Discussion on topics including the generality of interruption problems, collecting logs from killed workers, and finding additional learning resources for Celery and AWS.