Python Engineer / PySpark Data Engineer

BCforward

Jersey City, United States of America

4 days ago

Role details

Contract type

Temporary contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Compensation

$153K

Job location

Jersey City, United States of America

Tech stack

Agile Methodologies

Azure

Cloudera Impala

Continuous Integration

Data Validation

Information Engineering

Software Debugging

Hadoop

Hadoop Distributed File System

Hive

Python

Performance Tuning

Cloudera

Secure Coding

Data Streaming

Management of Software Versions

Data Logging

Data Ingestion

Spark

Virtual Environment

PySpark

Git Flow

Integration Tests

Jenkins

Databricks

Job description

Design and implement batch and streaming data ingestion and transformation jobs using Python and PySpark.
Build reusable frameworks for data quality checks, schema management, and error handling with retry logic.
Integrate pipelines with CI/CD processes and Git workflows with artifact versioning.
Apply secure coding practices, protect secrets and PII, and ensure compliance.
Tune performance using partitioning, caching, broadcast joins, and memory configuration.
Implement observability with structured logging, metrics, and tracing for distributed debugging.
Collaborate with architects and DataOps to deliver robust and compliant solutions with clear documentation.

Requirements

We are seeking a Python Engineer to join our dynamic team. The ideal candidate will have strong experience in Python, PySpark/Spark, and data engineering on Cloudera/Hadoop and Databricks, and a proven ability to design, implement, and optimize secure, observable, and testable data ingestion and transformation pipelines.

Strong Python, including packaging and virtual environments.
PySpark/Spark with demonstrated performance tuning expertise.
Data ingestion for batch and streaming, schema management, error handling, and retry logic.
Test discipline across unit and integration tests, data quality assertions, and reproducible pipelines.
CI/CD using Azure DevOps or Jenkins, Git workflows, artifact versioning, and release readiness.
Experience on Cloudera/Hadoop (HDFS, Spark, Hive/Impala) and Databricks (clusters, jobs, notebooks, Delta).
Observability with structured logging, metrics, tracing, and debugging in distributed contexts.
Secure coding, secret management, PII protection, and compliance awareness.
Strong communication and collaborative work style with documentation of frameworks and patterns.
Minimum 5 years of relevant experience.

Preferred Skills:

Experience with Agile ceremonies and iterative delivery.
Experience building reusable data quality and pipeline frameworks.

Benefits & conditions

Competitive compensation and benefits.
Opportunities for growth with global clients.
A supportive, inclusive culture that values innovation and people.
Exposure to cutting-edge technologies and projects.

About the company

BCforward is a leading global IT consulting and workforce solutions firm providing services and support to Fortune 500 and government clients. Founded in 1998, BCforward has grown with our customers' needs into a full-service business solutions provider. With delivery centers and offices across North America and India, we take pride in building long-term relationships and delivering excellence through innovation, collaboration, and integrity.
