AWS Data Engineer

Thrive IT Systems Ltd

Charing Cross, United Kingdom

4 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

Charing Cross, United Kingdom

Tech stack

4G (Telecommunication)

Airflow

Amazon Web Services (AWS)

Data analysis

Apache HTTP Server

Network Analysis

Continuous Integration

Information Engineering

Data Governance

ETL

Data Systems

Data Warehousing

Software Debugging

DevOps

Fraud Prevention and Detection

Identity and Access Management

Python

Performance Tuning

SQL Databases

Management of Software Versions

Data Processing

Data Ingestion

Spark

Git

CloudFormation

Data Lake

Semi-structured Data

Kafka

OSS/BSS

Terraform

Data Pipelines

Serverless Computing

Redshift

Job description

We are seeking an experienced AWS Data Engineer with strong expertise in ETL pipelines, Redshift, Iceberg, Athena, and S3 to support large-scale data processing and analytics initiatives in the telecom domain. The candidate will work closely with data architects, business analysts, and cross-functional teams to build scalable and efficient data solutions supporting network analytics, customer insights, billing systems, and telecom OSS/BSS workflows.

Data Engineering & ETL Development

Design, develop, and maintain ETL/ELT pipelines using AWS-native services (Glue, Lambda, EMR, Step Functions).
Implement data ingestion from telecom sources such as OSS/BSS, CDRs, mediation systems, CRM, billing, and network logs.
Optimize ETL workflows for large-scale telecom datasets (high volume, high velocity).

Data Warehousing (Redshift)

Build and manage scalable Amazon Redshift clusters for reporting and analytics.
Create and optimize schemas, tables, distribution keys, sort keys, and workload management (WLM) settings.
Implement Redshift Spectrum to query data in S3 using external tables.

Data Lake & Iceberg

Implement and maintain Apache Iceberg tables on AWS for schema evolution and ACID operations.
Build Iceberg-based ingestion and transformation pipelines using Glue, EMR, or Spark.
Ensure high performance for petabyte-scale telecom datasets (CDRs, tower logs, subscriber activity).

Querying & Analytics (Athena)

Develop and optimize Athena queries for operational and analytical reporting.
Integrate Athena with S3/Iceberg for low-cost, serverless analytics.
Manage Glue Data Catalog integrations and table schema management.

Storage (S3) & Data Lake Architecture

Design secure, cost-efficient S3 data lake structures (bronze/silver/gold zones).
Implement data lifecycle policies, versioning, and partitioning strategies.
Ensure data governance, metadata quality, and security (IAM, Lake Formation).

Telecom Domain Expertise

Understand telecom-specific datasets such as:

- CDR, xDR, subscriber data
- Network KPIs (4G/5G tower logs)
- Customer lifecycle & churn data
- Billing & revenue assurance

Build models and pipelines to support network analytics, customer 360, churn prediction, fraud detection, etc.

Performance Optimization & Monitoring

Tune Spark/Glue jobs for performance and cost.
Monitor Redshift/Athena/S3 efficiency and implement best practices.
Perform data quality checks and validation across pipelines.

DevOps & CI/CD (Preferred)

Use Git, CodePipeline, and Terraform/CloudFormation for infrastructure and deployments.
Automate pipeline deployment and monitoring.

Requirements

8-10 years' experience in data engineering.
Strong hands-on experience with:

- AWS S3, Athena, Glue, Redshift, EMR/Spark
- Apache Iceberg
- Python/SQL

Experience in telecom data pipelines and handling large-scale structured/semi-structured data.
Strong problem-solving, optimization, and debugging skills.

Good to Have Skills