Python/PySpark Engineer

Spectraforce
Salt Lake City, United States of America
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Salt Lake City, United States of America

Tech stack

Airflow
Amazon Web Services (AWS)
Data analysis
Azure
Big Data
Continuous Integration
Data Architecture
Data Governance
Data Integrity
ETL
Data Warehousing
Distributed Systems
Github
Hadoop
Hive
Python
Performance Tuning
SQL Databases
Data Streaming
Data Storage Technologies
Spark
Caching
Data Lake
PySpark
Kafka
Spark Streaming
Data Management
Data Pipelines
Jenkins
Databricks

Job description

Looking for an experienced PySpark Data Engineer to support banking data platforms, regulatory reporting, and large-scale transaction processing systems. The role involves building scalable data pipelines, ensuring data integrity, and enabling analytics across financial systems.

  • Design and develop high-performance data pipelines using PySpark & Spark SQL for BFS use cases
  • Process large-scale transactional, customer, and risk data across distributed systems
  • Build and maintain ETL/ELT pipelines for regulatory, reporting, and analytics requirements
  • Integrate data from multiple BFS systems (Core Banking, Payments, Risk, AML, etc.)
  • Implement data quality checks, reconciliation, and audit controls
  • Optimize Spark workloads (partitioning, joins, caching, performance tuning)
  • Work with data lakes/lakehouse (Delta Lake, S3, ADLS) for governed data storage
  • Ensure compliance with data governance, security, and regulatory standards (e.g., BCBS, GDPR, SOX)
  • Collaborate with business analysts, risk teams, and downstream reporting teams

Requirements

  • Strong expertise in Python + PySpark (RDD, DataFrames, Spark SQL)
  • Solid experience in banking/financial domain data (transactions, accounts, payments, risk)
  • Strong hands-on experience with SQL and data warehousing concepts
  • Experience with ETL pipelines & data pipeline architecture
  • Knowledge of big data ecosystem (Spark, Hive, Hadoop, Kafka)
  • Experience with cloud platforms (AWS / Azure)
  • Hands-on experience with Databricks / Spark clusters
  • Understanding of data governance, audit, lineage, and compliance

Good-to-Have Skills

  • Experience in Regulatory Reporting / Risk / AML / Fraud Analytics
  • Knowledge of Delta Lake / Lakehouse architecture
  • Exposure to Airflow / orchestration tools
  • CI/CD tools (Jenkins, GitHub Actions)
  • Understanding of streaming data (Kafka / Spark Streaming)
