Python/PySpark Engineer

Spectraforce

Salt Lake City, United States of America

1 month ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Salt Lake City, United States of America

Tech stack

Airflow

Amazon Web Services (AWS)

Data analysis

Azure

Big Data

Continuous Integration

Data Architecture

Data Governance

Data Integrity

ETL

Data Warehousing

Distributed Systems

GitHub

Hadoop

Hive

Python

Performance Tuning

SQL Databases

Data Streaming

Data Storage Technologies

Spark

Caching

Data Lake

PySpark

Kafka

Spark Streaming

Data Management

Data Pipelines

Jenkins

Databricks

Job description

Looking for an experienced PySpark Data Engineer to support banking data platforms, regulatory reporting, and large-scale transaction processing systems. The role involves building scalable data pipelines, ensuring data integrity, and enabling analytics across financial systems.

Design and develop high-performance data pipelines using PySpark & Spark SQL for BFS use cases
Process large-scale transactional, customer, and risk data across distributed systems
Build and maintain ETL/ELT pipelines for regulatory, reporting, and analytics requirements
Integrate data from multiple BFS systems (Core Banking, Payments, Risk, AML, etc.)
Implement data quality checks, reconciliation, and audit controls
Optimize Spark workloads (partitioning, joins, caching, performance tuning)
Work with data lakes/lakehouse (Delta Lake, S3, ADLS) for governed data storage
Ensure compliance with data governance, security, and regulatory standards (e.g., BCBS, GDPR, SOX)
Collaborate with business analysts, risk teams, and downstream reporting teams

Requirements

Strong expertise in Python + PySpark (RDD, DataFrames, Spark SQL)
Solid experience in banking/financial domain data (transactions, accounts, payments, risk)
Strong hands-on experience with SQL and data warehousing concepts
Experience with ETL pipelines & data pipeline architecture
Knowledge of the big data ecosystem (Spark, Hive, Hadoop, Kafka)
Experience with cloud platforms (AWS / Azure)
Hands-on experience with Databricks / Spark clusters
Understanding of data governance, audit, lineage, and compliance

Good-to-Have Skills:

Experience in Regulatory Reporting / Risk / AML / Fraud Analytics
Knowledge of Delta Lake / Lakehouse architecture
Exposure to Airflow / orchestration tools
CI/CD tools (Jenkins, GitHub Actions)
Understanding of streaming data (Kafka / Spark Streaming)
