PySpark Developer

Rose International
Tampa, United States of America
23 days ago

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$156K

Job location

Remote
Tampa, United States of America

Tech stack

Query Performance
Third Normal Form
Airflow
Google BigQuery
Code Review
Databases
Continuous Integration
Directed Acyclic Graph (Directed Graphs)
Data Architecture
Information Engineering
Data Governance
ETL
Data Structures
Data Systems
Data Vault Modeling
Data Warehousing
Relational Databases
Software Design Patterns
Distributed Systems
Memory Management
Hadoop
Hive
Python
PostgreSQL
Online Analytical Processing
Online Transaction Processing
Oracle Applications
SQL Databases
Freeform SQL
SQL Optimization
Snowflake
Database Optimization
Spark
Indexer
PySpark
Low Latency
Optimization Algorithms
Star Schema
Kafka
Stream Processing
Data Pipelines
Control-M

Job description

  • Architecture & Data Modeling: Design and implement robust logical and physical data models for both transactional (OLTP) and analytical (OLAP) workloads. Lead the transition from legacy data structures to modern, scalable cloud/hybrid architectures.

  • Pipeline Engineering: Architect, build, and deploy highly scalable data pipelines using PySpark to process massive volumes of complex financial data with low latency.
  • Advanced SQL & Database Optimization: Write, tune, and optimize complex SQL queries. Troubleshoot query performance bottlenecks and implement data partitioning, indexing, and clustering strategies.
  • Technical Leadership: Serve as a Subject Matter Expert (SME) for the data platform. Review code, establish CI/CD best practices for data engineering, and ensure all design adheres to the overall architectural blueprint.
  • Stakeholder Collaboration: Partner with product managers, business analysts, and downstream consumers (Data Science and BI teams) to translate complex financial business requirements into technical deliverables.
  • Risk & Compliance: Appropriately assess risk when architectural decisions are made, demonstrating consideration for the firm's reputation and safeguarding Client data by driving compliance with applicable data governance laws, rules, and regulations.

  • Only those lawfully authorized to work in the designated country associated with the position will be considered.

  • Please note that all Position start dates and duration are estimates and may be reduced or lengthened based upon a client's business needs and requirements.

Requirements

  • Do you have experience in System design?
  • Do you have a Bachelor's degree?

Must Have Skills/Attributes: Banking/Financial, Data Modeling, ETL, Hadoop, Oracle, PySpark, Spark, SQL

Experience Desired:

  • Overall software/data engineering experience (10+ yrs)
  • Operating at a senior, lead, or architectural level within a large enterprise (4-5 yrs)
  • Experience navigating data governance & data quality frameworks in heavily regulated banking field (4+ yrs)

Preferred Education: Bachelor's Degree

C2C is not available.

  • Bachelor's Degree
  • Experience: 10+ years of overall software/data engineering experience, with a minimum of 4-5 years operating at a senior, lead, or architectural level within a large enterprise (financial services experience is highly preferred).

  • PySpark Mastery: Production-level expertise in Apache Spark using Python (PySpark). Must understand Spark internals (DAGs, shuffling, memory management, and optimization techniques).
  • Data Modeling: Proven track record of building complex data models from scratch (Star/Snowflake schemas, Data Vault, or 3NF). Experience using data modeling tools (e.g., Erwin, Hackolade, or similar).
  • Database & SQL: Expert-level proficiency in SQL. Extensive hands-on experience with massive relational databases (e.g., Oracle, PostgreSQL) and modern data warehouses/lakes (e.g., Snowflake, BigQuery, or Hive/Hadoop).
  • Systems Design: Clear understanding of distributed systems processing, ETL/ELT design patterns, and enterprise data warehousing principles.
  • Communication: Demonstrated ability to translate complex technical concepts into clear, concise language for non-technical stakeholders and business leaders.

Preferred Qualifications/Skills/Experience:

  • Familiarity with modern orchestration tools (Airflow, Control-M).
  • Experience with real-time data streaming (Kafka).
  • Prior experience navigating data governance and data quality frameworks within a heavily regulated banking environment.

We are seeking a highly experienced PySpark Developer to lead the design, architecture, and development of mission-critical data pipelines and enterprise data models.

  • As a senior technical leader, you will bridge the gap between complex business requirements and highly scalable data architecture.
  • The ideal candidate possesses deep expertise in PySpark, advanced SQL optimization, and enterprise data modeling.
  • You will not only be a hands-on technical contributor but also serve as an architectural guide, mentoring junior developers, establishing best practices, and ensuring that data solutions are highly performant, resilient, and aligned with Client global technology standards.

Benefits & conditions

3.8 out of 5 stars
Tampa, FL 33610
Hybrid work
$65 - $75 an hour - Temp-to-hire