Senior Data Engineer and Architect

BP
2 days ago

Role details

Contract type
Temporary contract
Employment type
Part-time / full-time
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Remote

Tech stack

Amazon Web Services (AWS)
Automation of Tests
Code Review
Continuous Integration
Customer Data Management
Identity and Access Management
Python
SQL Stored Procedures
SQL Databases
Data Build Tool (dbt)
Spark
Modularization
PySpark
Code Restructuring
Data Pipelines
Databricks

Job description

We are looking for a hands-on engineering heavy-hitter to join our Customer Data & Intelligence (CDI) function immediately. We have a rich dataset covering multiple global markets, but our legacy codebase (monolithic SQL and Python scripts) is fragile.

Your immediate mission is to triage, refactor, and stabilize our critical data pipelines. You will take "God Queries" and break them down into modular, testable, and performant dbt models.
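
To make that concrete, here is a rough sketch of the pattern (table, column, and model names are illustrative, not our actual schema, and the staging model assumes a "raw" source declared in dbt): one slice of a monolithic query becomes a thin staging model, and the business logic moves into a mart built on top of it.

    -- models/staging/stg_orders.sql: rename and cast raw columns, nothing else
    select
        order_id,
        customer_id,
        cast(order_ts as date) as order_date,
        amount
    from {{ source('raw', 'orders') }}

    -- models/marts/fct_customer_revenue.sql: business logic, built on the staging layer
    select
        customer_id,
        count(*)    as order_count,
        sum(amount) as total_revenue
    from {{ ref('stg_orders') }}
    group by customer_id

Each model is small enough to test and review on its own, which is exactly what the monolithic scripts prevent today.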

Immediate Deliverables (First 30-60 Days):

  • The "Code Rescue": Audit and patch critical queries currently causing data corruptions. Fix logic errors.
  • Modularization Pilot: Implement dbt (Data Build Tool) within our AWS/Databricks environment. Migrate the most critical reporting tables from stored procedures/scripts into dbt models.
  • Automated Quality Gates: Deploy automated tests (using dbt tests or Great Expectations) to check identity uniqueness and catch data errors on critical columns. Stop bad data before it hits the dashboard. (A sketch follows this list.)
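
By way of a hypothetical example of such a gate: a dbt "singular" test is just a SQL file under tests/ that selects the rows violating a rule; if the query returns any rows, the build fails. The model name below is a placeholder, not one of our tables.

    -- tests/assert_customer_id_unique.sql
    -- dbt fails the run if this query returns any rows.
    select
        customer_id,
        count(*) as occurrences
    from {{ ref('dim_customers') }}
    group by customer_id
    having count(*) > 1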

What you will do:

  • Refactoring: Rewrite inefficient legacy SQL to improve performance and readability.
  • Pipeline Repair: Fix error handling in existing AWS Glue/PySpark jobs.
  • Standardization: Establish the "Gold Standard" for what good code looks like. Create the Pull Request template and SQL linting rules that the rest of the team must follow.
  • Mentorship: Act as the "Bar Raiser" in code reviews, establishing standards and teaching the existing team how to write modular, defensive code.

Who you are:

  • You hate "Toil." You refuse to check data manually; you write scripts to check it for you.
  • You are not afraid of legacy code. You see a messy codebase as a puzzle to be solved, not a reason to run away.
  • You care about Truth. You understand that "mostly correct" data is useless to a business.

Requirements

  • dbt (Data Build Tool): Proven experience setting up dbt from scratch. You know how to structure a project (Staging -> Intermediate -> Marts; see the layout sketch after this list).
  • Python & Spark: Ability to read and fix PySpark syntax errors and optimize Spark execution plans (Databricks/AWS Glue).
  • AWS Ecosystem: Comfortable with S3, Athena, and IAM permissions.
  • CI/CD: Experience setting up and running tests automatically on commit.
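
For reference, the layered dbt project structure mentioned above conventionally looks like the sketch below (file names illustrative):

    models/
      staging/        -- one model per source table; rename and cast only
        stg_orders.sql
        stg_customers.sql
      intermediate/   -- reusable joins and shared business logic
        int_orders_enriched.sql
      marts/          -- final, consumer-facing tables
        fct_customer_revenue.sql
        dim_customers.sql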

Apply for this position