Data Engineer (Senior Level) - AWS & Streaming

TUPPL Technology Inc
Austin, United States of America
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Austin, United States of America

Tech stack

Amazon Web Services (AWS)
Profiling
Data Validation
Information Engineering
Data Governance
Data Integrity
ETL
DevOps
Distributed Computing Environment
Python
Cloud Services
SQL Databases
Data Streaming
Data Processing
AWS Lambda
CloudFormation
Data Lake
PySpark
Apache Flink
Kafka
Data Management
Terraform
Stream Processing
Data Pipelines

Job description

We are seeking a Mid-Senior Data Engineer with strong expertise in AWS-based data engineering, real-time streaming technologies, and enterprise-grade data quality frameworks. The ideal candidate will design, build, and optimize scalable batch and streaming data pipelines, implement robust data validation and monitoring processes, and support mission-critical analytics platforms.

  • Develop and maintain scalable ETL/ELT pipelines using AWS Glue, PySpark, and Python

  • Build event-driven workflows using AWS Lambda

  • Design and manage real-time streaming solutions using Kafka, KSQL, and Apache Flink

  • Implement and enforce comprehensive data quality frameworks, including validation, profiling, monitoring, and reconciliation

  • Optimize data processing performance, scalability, reliability, and cost in cloud environments

  • Collaborate with cross-functional teams to deliver reliable, production-grade data platforms and ensure data integrity across the pipeline

Requirements

  • Strong hands-on experience with Python and PySpark

  • Proven expertise in AWS Glue, Lambda, and other cloud-native data services

  • Solid experience with the Kafka ecosystem (topics, partitions, consumer groups, streaming patterns)

  • Demonstrated experience building and supporting data quality frameworks (validation rules, reconciliation checks, profiling, anomaly detection)

  • Strong understanding of distributed data processing and scalable architecture patterns

Good-to-Have Skills

  • Experience with Apache Flink for real-time stream processing and stateful computations

  • Knowledge of KSQL or other streaming SQL engines

  • Exposure to CI/CD pipelines, IaC (Terraform/CloudFormation), and DevOps practices

  • Familiarity with data lake/lakehouse architectures and table formats such as Iceberg, Delta, or Hudi

  • Experience working in enterprise or financial data environments
