AWS Data Engineer

Thrive IT Systems Ltd
Charing Cross, United Kingdom
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Charing Cross, United Kingdom

Tech stack

4G (Telecommunication)
Airflow
Amazon Web Services (AWS)
Data analysis
Apache HTTP Server
Network Analysis
Continuous Integration
Information Engineering
Data Governance
ETL
Data Systems
Data Warehousing
Software Debugging
DevOps
Fraud Prevention and Detection
Identity and Access Management
Python
Performance Tuning
SQL Databases
Management of Software Versions
Data Processing
Data Ingestion
Spark
Git
CloudFormation
Data Lake
Semi-structured Data
Kafka
OSS/BSS
Terraform
Data Pipelines
Serverless Computing
Redshift

Job description

We are seeking an experienced AWS Data Engineer with strong expertise in ETL pipelines, Redshift, Iceberg, Athena, and S3 to support large-scale data processing and analytics initiatives in the telecom domain. The candidate will work closely with data architects, business analysts, and cross-functional teams to build scalable and efficient data solutions supporting network analytics, customer insights, billing systems, and telecom OSS/BSS workflows.

  1. Data Engineering & ETL Development

  • Design, develop, and maintain ETL/ELT pipelines using AWS-native services (Glue, Lambda, EMR, Step Functions).
  • Implement data ingestion from telecom systems such as OSS/BSS, CDRs, mediation systems, CRM, billing, and network logs.
  • Optimize ETL workflows for large-scale telecom datasets (high volume, high velocity).
  2. Data Warehousing (Redshift)
  • Build and manage scalable Amazon Redshift clusters for reporting and analytics.
  • Create and optimize schemas, tables, distribution keys, sort keys, and workload management.
  • Implement Redshift Spectrum to query data in S3 using external tables.
  3. Data Lake & Iceberg
  • Implement and maintain Apache Iceberg tables on AWS for schema evolution and ACID operations.
  • Build Iceberg-based ingestion and transformation pipelines using Glue, EMR, or Spark.
  • Ensure high performance for petabyte-scale telecom datasets (CDRs, tower logs, subscriber activity).
  4. Querying & Analytics (Athena)
  • Develop and optimize Athena queries for operational and analytical reporting.
  • Integrate Athena with S3/Iceberg for low-cost, serverless analytics.
  • Manage Glue Data Catalog integrations and table schema management.
  5. Storage (S3) & Data Lake Architecture
  • Design secure, cost-efficient S3 data lake structures (bronze/silver/gold zones).
  • Implement data lifecycle policies, versioning, and partitioning strategies.
  • Ensure data governance, metadata quality, and security (IAM, Lake Formation).
  6. Telecom Domain Expertise
  • Understand telecom-specific datasets such as:

      o CDR, xDR, subscriber data
      o Network KPIs (4G/5G tower logs)
      o Customer lifecycle & churn data
      o Billing & revenue assurance

  • Build models and pipelines to support network analytics, customer 360, churn prediction, fraud detection, etc.
  7. Performance Optimization & Monitoring
  • Tune Spark/Glue jobs for performance and cost.
  • Monitor Redshift/Athena/S3 efficiency and implement best practices.
  • Perform data quality checks and validation across pipelines.
  8. DevOps & CI/CD (Preferred)
  • Use Git, CodePipeline, Terraform/CloudFormation for infrastructure and deployments.
  • Automate pipeline deployment and monitoring.

Requirements

  • 8-10 years' experience in data engineering.
  • Strong hands-on experience with:

      o AWS S3, Athena, Glue, Redshift, EMR/Spark
      o Apache Iceberg
      o Python/SQL

  • Experience in telecom data pipelines and handling large-scale structured/semi-structured data.
  • Strong problem-solving, optimization, and debugging skills.

Good to Have Skills

  • Knowledge of AWS Lake Formation, Kafka/Kinesis, Airflow, or Delta Lake/Apache Hudi.
  • Experience with ML workflows in telecom (churn, network prediction).
  • Exposure to 5G network data models.
