Data Platform Engineer
Globaldev Group
1 month ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Shift work
Languages: English
Experience level: Junior
Job location
Tech stack
Amazon Web Services (AWS)
Azure
Bash
Unix
Command-Line Interface
Cloud Computing
Continuous Integration
Software Debugging
Linux
DevOps
Distributed Systems
DNS
Hadoop
Hadoop Distributed File System
MapReduce
Hive
Subnetting
Job Scheduling
Log Analysis
Performance Tuning
Reliability Engineering
Shell Script
Data Streaming
Pulumi
Load Balancing
Apache Yarn
Spark
Firewalls (Computer Science)
Cloudformation
Gitlab-ci
Kafka
Video Streaming
Terraform
Tez (Software)
Jenkins
Job description
We are looking for a Data Platform Engineer to join our small, dynamic team at a fast-growing company delivering a cutting-edge video streaming OS. You will play a key role in both maintaining our legacy AWS-based data streaming infrastructure and contributing to the migration to a modern Azure-based platform.

Maintain & Optimize (AWS Infrastructure):
- Monitor and manage AWS MSK (Managed Streaming for Apache Kafka) clusters: broker health, partition rebalancing, consumer lag, throughput optimization.
- Administer AWS EMR (Elastic MapReduce) clusters running Hadoop, Spark, Hive, and Tez: cluster scaling, node health, resource allocation, job scheduling.
Build & Automate (Azure Migration):
- Implement infrastructure-as-code using Terraform based on architectural designs provided by the team lead and architect.
- Build and deploy cloud resources on Azure.
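The MSK monitoring duties above (broker health, consumer lag) often boil down to small scripts around the Kafka tooling. A minimal sketch of a consumer-lag check, assuming the column layout of the stock `kafka-consumer-groups.sh --describe` output (GROUP, TOPIC, PARTITION, CURRENT-OFFSET, LOG-END-OFFSET, LAG, ...) — verify against the Kafka version your MSK clusters run:

```python
def max_lag_per_topic(describe_output: str) -> dict[str, int]:
    """Worst partition lag per topic, parsed from `--describe` text.

    Column positions are an assumption based on the stock Kafka CLI
    output; confirm them for your MSK/Kafka version before relying
    on this in alerting.
    """
    lags: dict[str, int] = {}
    for line in describe_output.splitlines():
        parts = line.split()
        # Skip headers and blank lines; data rows carry a numeric LAG
        # in the sixth column (index 5).
        if len(parts) < 6 or not parts[5].lstrip("-").isdigit():
            continue
        topic, lag = parts[1], int(parts[5])
        lags[topic] = max(lags.get(topic, 0), lag)
    return lags


def lagging_topics(describe_output: str, threshold: int = 1000) -> list[str]:
    """Topics whose worst partition lag exceeds the alert threshold."""
    return [t for t, lag in max_lag_per_topic(describe_output).items()
            if lag > threshold]
```

In practice a check like this would run on a schedule (cron or a CI job) against `kafka-consumer-groups.sh --bootstrap-server ... --describe --all-groups` and page when `lagging_topics` is non-empty.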
Requirements
Do you have experience in UNIX?

- 1-4 years of hands-on experience in data platform engineering, site reliability engineering (SRE), DevOps, or distributed systems administration.
- Cloud infrastructure expertise with AWS (required) and/or Azure: not just using services, but configuring, tuning, and troubleshooting them.
- Kafka/MSK experience: understanding of topics, partitions, consumer groups, replication, broker configurations, and performance tuning.
- Hadoop ecosystem administration: HDFS, YARN, MapReduce, Hive, or Spark cluster management and troubleshooting.
- Linux/Unix system administration: command-line proficiency, shell scripting (Bash), process monitoring, log analysis.
- Infrastructure-as-Code: Terraform (preferred) or similar tools (CloudFormation, ARM templates, Pulumi).
- CI/CD and automation: GitLab CI, Jenkins, or similar; building pipelines for infrastructure deployments.
- Experience with distributed system monitoring and debugging: understanding logs, metrics, traces, resource contention, and performance bottlenecks.
- Comfortable with networking concepts: VPCs, subnets, security groups, load balancers, DNS, firewalls.

Nice to have:

- Experience migrating workloads between cloud providers.
- Exposure to Azure Spark deployment options.
- Familiarity with container orchestration.
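The log-analysis and shell-scripting requirements above describe day-to-day triage work. A minimal sketch of one such task, tallying ERROR lines per service; the syslog-style line format (and the service names in the comment) are assumptions for illustration:

```python
import re
from collections import Counter

# Hypothetical syslog-style line format, e.g.:
#   "Jan 12 03:04:05 web1 nginx[100]: ERROR upstream timed out"
LINE_RE = re.compile(
    r"^\S+ +\d+ [\d:]+ \S+ (?P<svc>[\w-]+)\[\d+\]: (?P<level>\w+)"
)


def error_counts(lines) -> Counter:
    """Count ERROR entries per service; lines that do not match the
    assumed format are skipped rather than raising."""
    counts: Counter = Counter()
    for line in lines:
        m = LINE_RE.match(line)
        if m and m.group("level") == "ERROR":
            counts[m.group("svc")] += 1
    return counts
```

The same job is often done one-off with `grep ERROR | awk '{print $5}' | sort | uniq -c`; a small script like this is the step up when the check needs to be repeatable or feed a dashboard.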
Benefits & conditions
- Flexible work arrangements.
- 20 working days per year of Non-Operational Allowance, to be used for personal recreation and compensated in full.
- Collaborative and supportive team culture.
- Truly competitive salary.
- Help and support from our caring HR team.