AWS Data Engineer

Diverse Lynx LLC
Chicago, United States of America
28 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$140K

Job location

Chicago, United States of America

Tech stack

Amazon Web Services (AWS)
Apache Iceberg
Big Data
Data Engineering
ETL
Data Systems
Distributed Data Store
Performance Tuning
Software Version Management
Data Processing
Data Storage Technologies
Build Management
Data Lake
PySpark
Data Management
Data Pipelines
Databricks

Job description

We are looking for a skilled Data Engineer to design and build scalable data solutions using PySpark and AWS services. The ideal candidate will have hands-on experience building modern data platforms with Apache Iceberg and implementing the Medallion architecture on AWS.

Responsibilities

  • Design and implement end-to-end data solutions using PySpark, ensuring scalability and performance.
  • Build and manage data pipelines using AWS services such as AWS Glue, EMR, and Lambda.
  • Develop data products using PySpark + AWS Glue stack.
  • Implement Medallion Architecture (Bronze, Silver, Gold layers) for structured data processing, as sketched after this list.
  • Work with Apache Iceberg tables for efficient data storage, versioning, and schema evolution.
  • Ensure data quality, governance, and optimization across pipelines.
  • Collaborate with cross-functional teams to understand business requirements and translate them into technical solutions.
  • Optimize data processing jobs and improve performance and cost-efficiency on AWS.
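
As a rough, hedged sketch of the Bronze/Silver/Gold flow described above, the snippet below writes each layer to Apache Iceberg tables registered in the AWS Glue Data Catalog using PySpark. The catalog name ("glue_catalog"), S3 paths, and table names are illustrative, and the configuration assumes the Iceberg Spark runtime is available to the job (for example as a dependency of an AWS Glue or EMR job).

  from pyspark.sql import SparkSession, functions as F

  # Spark session wired for Iceberg tables backed by the AWS Glue Data Catalog.
  # The catalog name, warehouse path, and table names below are placeholders.
  spark = (
      SparkSession.builder
      .config("spark.sql.extensions",
              "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions")
      .config("spark.sql.catalog.glue_catalog",
              "org.apache.iceberg.spark.SparkCatalog")
      .config("spark.sql.catalog.glue_catalog.catalog-impl",
              "org.apache.iceberg.aws.glue.GlueCatalog")
      .config("spark.sql.catalog.glue_catalog.warehouse",
              "s3://example-bucket/warehouse/")
      .getOrCreate()
  )

  # Bronze: land raw files with minimal transformation.
  raw = spark.read.json("s3://example-bucket/landing/orders/")
  raw.writeTo("glue_catalog.bronze.orders_raw").createOrReplace()

  # Silver: deduplicate, enforce types, and drop records missing a key.
  silver = (
      spark.table("glue_catalog.bronze.orders_raw")
      .dropDuplicates(["order_id"])
      .withColumn("order_ts", F.to_timestamp("order_ts"))
      .filter(F.col("order_id").isNotNull())
  )
  silver.writeTo("glue_catalog.silver.orders").createOrReplace()

  # Gold: publish an aggregated, business-facing data product.
  gold = (
      spark.table("glue_catalog.silver.orders")
      .groupBy(F.to_date("order_ts").alias("order_date"))
      .agg(F.sum("amount").alias("daily_revenue"))
  )
  gold.writeTo("glue_catalog.gold.daily_revenue").createOrReplace()

The same session settings could equally be supplied as Spark configuration on the job itself rather than hard-coded.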

Requirements

  • Strong experience in PySpark for data processing and pipeline development.
  • Hands-on experience with AWS ecosystem (Glue, EMR, Lambda, S3).
  • Experience implementing Medallion Architecture.
  • Practical knowledge of Apache Iceberg or similar table formats (see the example after this list).
  • Strong understanding of distributed data processing and big data frameworks.
  • Experience designing scalable and reliable data pipelines.
  • Good understanding of data modeling and ETL/ELT concepts.
  • Experience working outside of Databricks-only environments (ability to build solutions using the native AWS stack).
  • Familiarity with modern data lake architectures and open table formats.
  • Knowledge of performance tuning and cost optimization in AWS.
  • Experience with CI/CD pipelines for data engineering workflows.
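
In the same spirit, a brief illustration of the Iceberg capabilities called out above (schema evolution, versioning/time travel, and table maintenance for performance and cost), reusing the hypothetical catalog and table names from the earlier sketch and assuming a recent Spark/Iceberg version:

  # Schema evolution: add a column in place, without rewriting the table.
  spark.sql("ALTER TABLE glue_catalog.silver.orders ADD COLUMNS (channel STRING)")

  # Versioning / time travel: read the table as of an earlier snapshot
  # (the snapshot id is a placeholder).
  spark.sql(
      "SELECT * FROM glue_catalog.silver.orders VERSION AS OF 1234567890"
  ).show()

  # Maintenance: compact small files and expire old snapshots to keep
  # query performance up and S3 storage costs down.
  spark.sql("CALL glue_catalog.system.rewrite_data_files(table => 'silver.orders')")
  spark.sql(
      "CALL glue_catalog.system.expire_snapshots("
      "table => 'silver.orders', retain_last => 5)"
  )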

What the Client is Specifically Looking For

  • Engineers who can independently design solutions using PySpark (not limited to Databricks).
  • Strong expertise in AWS-native data engineering tools.
  • Hands-on implementation experience with Apache Iceberg (preferred over Delta).
  • Ability to build data products using Glue + PySpark stack.
  • Clear understanding and implementation of Medallion architecture using AWS services.
