Python Engineer / PySpark Data Engineer
BCforward
Jersey City, United States of America
4 days ago
Role details
Contract type: Temporary contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Compensation: $153K
Job location: Jersey City, United States of America
Tech stack
Agile Methodologies
Azure
Cloudera Impala
Continuous Integration
Data Validation
Information Engineering
Software Debugging
Hadoop
Hadoop Distributed File System
Hive
Python
Performance Tuning
Cloudera
Secure Coding
Data Streaming
Management of Software Versions
Data Logging
Data Ingestion
Spark
Virtual Environment
PySpark
Git Flow
Integration Tests
Jenkins
Databricks
Job description
- Design and implement batch and streaming data ingestion and transformation jobs using Python and PySpark.
- Build reusable frameworks for data quality checks, schema management, and error handling with retry logic.
- Integrate pipelines with CI/CD processes and Git workflows with artifact versioning.
- Apply secure coding practices, protect secrets and PII, and ensure compliance.
- Tune performance using partitioning, caching, broadcast joins, and memory configuration.
- Implement observability with structured logging, metrics, and tracing for distributed debugging.
- Collaborate with architects and Data-Ops to deliver robust and compliant solutions with clear documentation.
Requirements
We are seeking a Python Engineer to join our dynamic team. The ideal candidate will have strong experience in Python, PySpark/Spark, and data engineering on Cloudera/Hadoop and Databricks and a proven ability to design, implement, and optimize secure, observable, and testable data ingestion and transformation pipelines.
- Strong Python, including packaging and virtual environments.
- PySpark/Spark with demonstrated performance tuning expertise.
- Data ingestion for batch and streaming, schema management, error handling, and retry logic.
- Test discipline across unit and integration tests, data quality assertions, and reproducible pipelines.
- CI/CD using Azure DevOps or Jenkins, Git workflows, artifact versioning, and release readiness.
- Experience on Cloudera/Hadoop (HDFS, Spark, Hive/Impala) and Databricks (clusters, jobs, notebooks, Delta).
- Observability with structured logging, metrics, tracing, and debugging in distributed contexts.
- Secure coding, secret management, PII protection, and compliance awareness.
- Strong communication and collaborative work style with documentation of frameworks and patterns.
- Minimum 5 years of relevant experience.
Preferred Skills:
- Experience with Agile ceremonies and iterative delivery.
- Experience building reusable data quality and pipeline frameworks.
Benefits & conditions
- Competitive compensation and benefits.
- Opportunities for growth with global clients.
- A supportive, inclusive culture that values innovation and people.
- Exposure to cutting-edge technologies and projects.
About the company
BCforward is a leading global IT consulting and workforce solutions firm providing services and support to Fortune 500 and government clients. Founded in 1998, BCforward has grown with our customers' needs into a full-service business solutions provider. With delivery centers and offices across North America and India, we take pride in building long-term relationships and delivering excellence through innovation, collaboration, and integrity.