AWS Lead Data Engineer
Job description
We do not need architects or Big Data engineers. We need an AWS Lead Data Engineer who can perform hands-on coding with these tools: Python, Pandas, PySpark, Terraform, AWS Glue, Lambda, S3, Redshift, and EMR. Let's make sure any candidates you submit have most, if not all, of these critical skills.

We are seeking an experienced AWS Lead Data Engineer to join our dynamic team. The ideal candidate will have 5+ years of experience in data engineering with a strong focus on AWS technologies. This role involves designing, developing, and maintaining scalable data pipelines and processing systems. The candidate should be adept at managing and optimizing data architectures and be passionate about data-driven solutions. Knowledge of machine learning is a plus.

Responsibilities
- Design and implement scalable data pipelines using AWS services such as Glue, Redshift, S3, Lambda, EMR, and Athena.
- Develop and maintain ELT processes to transform and integrate data from various sources.
- Collaborate with data scientists, analysts, and other stakeholders to understand data requirements and deliver high-quality data solutions.
- Optimize and tune performance of data pipelines and queries.
- Ensure data quality and integrity through robust testing and validation processes.
- Implement data security and compliance best practices.
- Monitor and troubleshoot data pipeline issues and ensure timely resolution.
- Stay updated with the latest developments in AWS data engineering technologies and best practices.
Requirements
- Bachelor's or Master's degree in Computer Science, Information Technology, or a related field.
- 5+ years of experience in data engineering with a focus on AWS technologies.
- Expertise in AWS services such as Glue, Redshift, S3, Lambda, EMR, and Athena.
- Strong programming skills in Python, Pandas, and SQL.
- Experience with database systems such as AWS RDS, PostgreSQL, and SAP HANA.
- Knowledge of data modeling, ETL processes, and data warehousing concepts.
- Familiarity with CI/CD pipelines and version control systems (e.g., Git).
- Experience writing infrastructure as code using Terraform.
- Familiarity with Glue Notebooks, SageMaker Notebooks, Textract, Rekognition, Bedrock, and other GenAI/LLM tools.
- Strong problem-solving skills and attention to detail.
- Excellent communication and collaboration skills.
Nice to Have
- AWS Certification (e.g., AWS Certified Data Analytics, AWS Certified Solutions Architect).
- Experience with machine learning frameworks and libraries (e.g., TensorFlow, PyTorch, Scikit-learn).
- Knowledge of AWS SageMaker and its integration within data pipelines.
- Knowledge of big data technologies such as Apache Spark, Hadoop, or Kafka.
- Experience with data visualization tools like Tableau, Power BI, or AWS QuickSight.
- Familiarity with Azure DevOps and Azure Pipelines.
- Familiarity with data catalog and governance tools such as AWS DQ and Collibra, and with profiling tools such as AWS Glue DataBrew.
Certifications that they would like to see
- AWS Certified Developer
- Denodo Platform 6.0 Certified Developer
- Tableau Desktop Qualified Associate
- Hortonworks HDP Certified Administrator (HDPCA)
- Cloudera Hadoop Developer Certification
- Oracle Data Warehousing 11g Essentials Certification
- Oracle Business Intelligence 10 Foundation Essentials