AWS Data Engineer
Job description
We are seeking an experienced AWS Data Engineer with strong expertise in ETL pipelines, Redshift, Iceberg, Athena, and S3 to support large-scale data processing and analytics initiatives in the telecom domain. The candidate will work closely with data architects, business analysts, and cross-functional teams to build scalable and efficient data solutions supporting network analytics, customer insights, billing systems, and telecom OSS/BSS workflows.
1. Data Engineering & ETL Development
- Design, develop, and maintain ETL/ELT pipelines using AWS-native services (Glue, Lambda, EMR, Step Functions).
- Implement data ingestion from telecom sources such as OSS/BSS, CDRs, mediation systems, CRM, billing, and network logs.
- Optimize ETL workflows for large-scale telecom datasets (high volume, high velocity).
2. Data Warehousing (Redshift)
- Build and manage scalable Amazon Redshift clusters for reporting and analytics.
- Create and optimize schemas, tables, distribution keys, sort keys, and workload management.
- Implement Redshift Spectrum to query data in S3 using external tables.
3. Data Lake & Iceberg
- Implement and maintain Apache Iceberg tables on AWS for schema evolution and ACID operations.
- Build Iceberg-based ingestion and transformation pipelines using Glue, EMR, or Spark.
- Ensure high performance for petabyte-scale telecom datasets (CDRs, tower logs, subscriber activity).
4. Querying & Analytics (Athena)
- Develop and optimize Athena queries for operational and analytical reporting.
- Integrate Athena with S3/Iceberg for low-cost, serverless analytics.
- Manage Glue Data Catalog integrations and table schema management.
5. Storage (S3) & Data Lake Architecture
- Design secure, cost-efficient S3 data lake structures (bronze/silver/gold zones).
- Implement data lifecycle policies, versioning, and partitioning strategies.
- Ensure data governance, metadata quality, and security (IAM, Lake Formation).
6. Telecom Domain Expertise
- Understand telecom-specific datasets such as:
o CDR, xDR, subscriber data
o Network KPIs (4G/5G tower logs)
o Customer lifecycle & churn data
o Billing & revenue assurance
- Build models and pipelines to support use cases such as network analytics, customer 360, churn prediction, and fraud detection.
7. Performance Optimization & Monitoring
- Tune Spark/Glue jobs for performance and cost.
- Monitor Redshift/Athena/S3 efficiency and implement best practices.
- Perform data quality checks and validation across pipelines.
8. DevOps & CI/CD (Preferred)
- Use Git, CodePipeline, Terraform/CloudFormation for infrastructure and deployments.
- Automate pipeline deployment and monitoring.
Requirements
- 8-10 years of experience in data engineering.
- Strong hands-on experience with:
o AWS S3, Athena, Glue, Redshift, EMR/Spark
o Apache Iceberg
o Python/SQL
- Experience in telecom data pipelines and handling large-scale structured/semi-structured data.
- Strong problem-solving, optimization, and debugging skills.
Good to Have Skills
- Knowledge of AWS Lake Formation, Kafka/Kinesis, Airflow, Delta Lake, or Apache Hudi.
- Experience with ML workflows in telecom (churn, network prediction).
- Exposure to 5G network data models.