Lead Data Engineer
Role details
Job location
Tech stack
Python
AWS (S3, Lambda, Glue)
PySpark
SQL
DynamoDB
Kafka / Kinesis
GenAI & LLMs
Advanced Java / Core Java

Job description
Design and build scalable data engineering solutions on AWS.
Develop batch and real-time data pipelines using PySpark, Kafka, and Kinesis.
Work on GenAI and Large Language Model (LLM) based applications.
Develop and optimize SQL queries for large-scale data processing.
Design distributed data processing systems using Spark or similar technologies.
Build and maintain data workflows using Airflow or Prefect.
Implement CI/CD pipelines and Infrastructure as Code (Terraform, Docker, Kubernetes).
Ensure data security, governance, monitoring, and reliability.
Lead technical design discussions and code reviews, and mentor junior engineers.

Preferred Skills
Apache Spark or Ray
Airflow / Prefect
Docker, Kubernetes, Terraform
Snowflake, Redshift, BigQuery
Event-Driven Architecture
Data Governance & Security

Requirements
Strong hands-on experience in Python and AWS.
Experience building large-scale Data Engineering platforms.
Good exposure to GenAI and LLM technologies.
Strong knowledge of Kafka/Kinesis and real-time data processing.
Ability to lead architecture decisions and mentor development teams.

Top 3 Skills
Python + PySpark
AWS (S3, Lambda, Glue, DynamoDB)
GenAI / LLMs with Kafka/Kinesis Data Engineering