AWS Data Engineer (Real-Time/Streaming)

The Modern

Atlanta, United States of America

yesterday

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Atlanta, United States of America

Tech stack

Amazon Web Services (AWS)

Cloud Computing

Continuous Integration

Data Integrity

ETL

Fraud Prevention and Detection

Monitoring of Systems

Python

Enterprise Messaging Systems

Data Streaming

Systems Architecture

Systems Integration

Amazon Connect

Parquet

Data Storage Technologies

Delivery Pipeline

Servicebus

Event Driven Architecture

Data Lake

PySpark

Kafka

Terraform

Stream Processing

Data Pipelines

Job description

Develop a comprehensive plan for migrating near real-time fraud detection campaigns from on-premises systems to AWS.
Design and implement event-driven architectures to process inbound dialer data (fraud events) using services such as Amazon EventBridge, Kafka, Kinesis Data Streams, and Kinesis Firehose.
Build and manage scalable data pipelines using AWS Glue (ETL jobs, Crawlers), PySpark, and Python for data ingestion, transformation, and processing.
Configure and manage Glue Crawlers to automatically discover schemas and update the Data Catalog.
Store data in Parquet format, optimized for efficient querying, and enable analytics through Amazon Athena.
Develop integrations between Customer Profiles and messaging platforms to automatically trigger profile updates and downstream processes.
Implement automation to trigger fraud-related outbound calls based on updates in customer profiles.
Design and orchestrate workflows using AWS Step Functions to manage complex processing pipelines.
Provision and manage cloud infrastructure using Terraform (Infrastructure as Code).
Optimize system architecture for scalability, reliability, and cost-efficiency while ensuring data integrity and security.
Conduct end-to-end testing of the entire framework to validate functionality, performance, and reliability.
Deploy, automate, and manage resources using CI/CD pipelines.
Continuously monitor system performance and implement optimizations post-deployment.
Maintain detailed documentation of architecture, workflows, and operational processes.

Technical Skills

Strong expertise in AWS services including:
Lambda
S3
EventBridge
Kinesis (Data Streams & Firehose)
Glue (ETL + Crawlers)
Step Functions
Amazon Connect
Athena
Macie

Proficient in:

Python
PySpark

Requirements

Glue Crawlers for schema discovery and cataloging
Parquet-based data storage
Building scalable data pipelines

Strong understanding of event-driven architectures (Pub/Sub model)

Hands-on experience with Terraform (Infrastructure as Code)

Familiarity with CI/CD tools and automation pipelines

Preferred Qualifications

AWS Certifications:
AWS Certified Developer
AWS Certified Solutions Architect

Experience with:

Messaging platforms like Kafka and Amazon EventBridge
Designing real-time data processing systems
Using Glue Crawlers + Athena for data lake architectures
