Senior Data Engineer

Randstad
Malvern, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$ 149K

Job location

Malvern, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Big Data
Unix
Computer Programming
Databases
Data as a Service
Data Architecture
Data Validation
Information Engineering
ETL
Data Mining
Database Queries
DevOps
Distributed Systems
Amazon DynamoDB
Python
NumPy
Oracle Applications
Performance Tuning
Cloud Services
SQL Databases
Data Processing
Scripting (Bash/Python/Go/Ruby)
Freeform SQL
Cloud Platform System
Snowflake
FastAPI
Pandas
PySpark
Deployment Automation
Data Analytics
Data Management
Software Coding
Terraform
Data Pipelines
Docker
Jenkins

Job description

We are seeking a highly experienced and hands-on Senior Data Engineer to join our Data Engineering teams. You will play a key role in supplementing existing capacity, upgrading our data architecture, and ensuring the highest quality, performance, and cost-efficiency of our data platforms. The work is focused on critical deliverables for personal investment, personal wealth, and comprehensive data analytics, while preparing the platform for a larger strategic move in the future.

  • Design, build, and maintain high-performance ETL/ELT data pipelines using Python and PySpark.
  • Apply expert-level coding skills to develop and manage data processing jobs leveraging PySpark for distributed computing across large-scale datasets.
  • Take full ownership of the data workflow, including ingesting data from multiple sources, then cleansing and validating it to ensure the highest quality.
  • Write and optimize complex, performant SQL queries for data extraction, integrity checks, and performance tuning.
  • Contribute to platform modernization by exploring and increasing the adoption of AI/ML, including using tools like Copilot and Claude for acceleration, and building models to fill data gaps or improve systems.
  • Collaborate with data architects by proposing ideas and asking good questions, taking ownership as the expert on data, pipelines, and systems.
  • Implement DevOps practices for the automated deployment and orchestration of Python applications and data pipelines (e.g., using Docker, Jenkins, Terraform).
  • Hands-on experience with SQL and complex performance tuning.

Requirements

Programming: Expert-level proficiency in Python, including libraries like Pandas and NumPy.

Designing: Experience designing data pipelines that ingest data from multiple sources.

Data Processing: Solid hands-on experience with PySpark for building scalable data workflows.

Data Querying: Expert-level knowledge of writing complex SQL queries (Oracle or Snowflake), with proven ability to perform performance tuning on large datasets and complex database code.

Cloud Platform: Robust experience with AWS cloud services and associated data services, specifically:

  • AWS Glue (ETL)
  • S3
  • Lambda
  • Redshift
  • DynamoDB, Athena, ECS, EventBridge, OpenSearch, RDS

ETL & Data Management: Robust proficiency in ETL/ELT methodologies and tools, as well as Data Quality, Data Validation, and Anomaly Detection techniques.

Scripting: Working experience with scripting and automation using Unix and Python.
