Python/PySpark Engineer
Spectraforce
Salt Lake City, United States of America
3 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Job location: Salt Lake City, United States of America
Tech stack
Airflow
Amazon Web Services (AWS)
Data analysis
Azure
Big Data
Continuous Integration
Data Architecture
Data Governance
Data Integrity
ETL
Data Warehousing
Distributed Systems
GitHub
Hadoop
Hive
Python
Performance Tuning
SQL Databases
Data Streaming
Data Storage Technologies
Spark
Caching
Data Lake
PySpark
Kafka
Spark Streaming
Data Management
Data Pipelines
Jenkins
Databricks
Job description
Looking for an experienced PySpark Data Engineer to support banking data platforms, regulatory reporting, and large-scale transaction processing systems. The role involves building scalable data pipelines, ensuring data integrity, and enabling analytics across financial systems.
- Design and develop high-performance data pipelines using PySpark & Spark SQL for BFS use cases
- Process large-scale transactional, customer, and risk data across distributed systems
- Build and maintain ETL/ELT pipelines for regulatory, reporting, and analytics requirements
- Integrate data from multiple BFS systems (Core Banking, Payments, Risk, AML, etc.)
- Implement data quality checks, reconciliation, and audit controls
- Optimize Spark workloads (partitioning, joins, caching, performance tuning)
- Work with data lakes/lakehouse (Delta Lake, S3, ADLS) for governed data storage
- Ensure compliance with data governance, security, and regulatory standards (e.g., BCBS, GDPR, SOX)
- Collaborate with business analysts, risk teams, and downstream reporting teams
Requirements
- Strong expertise in Python + PySpark (RDD, DataFrames, Spark SQL)
- Solid experience in banking/financial domain data (transactions, accounts, payments, risk)
- Strong hands-on experience with SQL and data warehousing concepts
- Experience with ETL pipelines & data pipeline architecture
- Knowledge of the big data ecosystem (Spark, Hive, Hadoop, Kafka)
- Experience with cloud platforms (AWS / Azure)
- Hands-on experience with Databricks / Spark clusters
- Understanding of data governance, audit, lineage, and compliance
Good-to-Have Skills
- Experience in Regulatory Reporting / Risk / AML / Fraud Analytics
- Knowledge of Delta Lake / Lakehouse architecture
- Exposure to Airflow / orchestration tools
- CI/CD tools (Jenkins, GitHub Actions)
- Understanding of streaming data (Kafka / Spark Streaming)