AWS Data Engineer (Real-Time/Streaming)

The Modern
Atlanta, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Atlanta, United States of America

Tech stack

Amazon Web Services (AWS)
Cloud Computing
Continuous Integration
Data Integrity
ETL
Fraud Prevention and Detection
Monitoring of Systems
Python
Enterprise Messaging Systems
Data Streaming
Systems Architecture
Systems Integration
Amazon Connect
Parquet
Data Storage Technologies
Delivery Pipeline
Servicebus
Event Driven Architecture
Data Lake
PySpark
Kafka
Terraform
Stream Processing
Data Pipelines

Job description

  • Develop a comprehensive plan for migrating near real-time fraud detection campaigns from on-premises systems to AWS.
  • Design and implement event-driven architectures to process inbound dialer data (fraud events) using services such as Amazon EventBridge, Kafka, Kinesis Data Streams, and Kinesis Firehose.
  • Build and manage scalable data pipelines using AWS Glue (ETL jobs, Crawlers), PySpark, and Python for data ingestion, transformation, and processing.
  • Configure and manage Glue Crawlers to automatically discover schemas and update the Data Catalog.
  • Store data in Parquet format and enable efficient querying and analytics through Amazon Athena.
  • Develop integrations between Customer Profiles and messaging platforms to automatically trigger profile updates and downstream processes.
  • Implement automation to trigger fraud-related outbound calls based on updates in customer profiles.
  • Design and orchestrate workflows using AWS Step Functions to manage complex processing pipelines.
  • Provision and manage cloud infrastructure using Terraform (Infrastructure as Code).
  • Optimize system architecture for scalability, reliability, and cost-efficiency while ensuring data integrity and security.
  • Conduct end-to-end testing of the entire framework to validate functionality, performance, and reliability.
  • Deploy, automate, and manage resources using CI/CD pipelines.
  • Continuously monitor system performance and implement optimizations post-deployment.
  • Maintain detailed documentation of architecture, workflows, and operational processes.

Technical Skills

  • Strong expertise in AWS services including:
    • Lambda
    • S3
    • EventBridge
    • Kinesis (Data Streams & Firehose)
    • Glue (ETL + Crawlers)
    • Step Functions
    • Amazon Connect
    • Athena
    • Macie

Proficient in:

  • Python
  • PySpark

Requirements

  • Glue Crawlers for schema discovery and cataloging
  • Parquet-based data storage
  • Building scalable data pipelines
  • Strong understanding of event-driven architectures (Pub/Sub model)
  • Hands-on experience with Terraform (Infrastructure as Code)
  • Familiarity with CI/CD tools and automation pipelines

Preferred Qualifications

  • AWS Certifications:
    • AWS Certified Developer
    • AWS Solutions Architect

Experience with:

  • Messaging platforms like Kafka and Amazon EventBridge
  • Designing real-time data processing systems
  • Using Glue Crawlers + Athena for data lake architectures
