Apache Spark Developer
Job description
We are actively hiring a TS/SCI-cleared Apache Spark Developer to support NGA's Data Modernization Services (DMS) mission by building and optimizing large-scale data processing pipelines. This role focuses on developing high-performance Spark applications within a containerized, Kubernetes-based environment, supporting mission analytics, data exploitation, and AI/ML integration. The ideal candidate thrives in distributed data environments, understands performance tuning deeply, and can operate effectively in secure, air-gapped systems.
This role is on-site with flexible hours in Herndon, VA; Springfield, VA; St. Louis, MO; or Aurora, CO.
Clearance Required for this role: TS/SCI eligibility with willingness/ability to obtain CI polygraph.
Core Technology Stack
Data / Processing
- Apache Spark (PySpark, Scala)
- Delta Lake, Parquet
- Structured Streaming
Infrastructure
- Kubernetes (execution environment)
- Docker
Storage / Cloud (Abstracted)
- S3 / object storage
- AWS / GCP / Azure (environment-dependent)
DevOps (Exposure Level)
- Git, Jenkins (CI/CD)
Languages
- Python (PySpark)
- Scala (preferred)
- Bash / scripting
Responsibilities
- Design, develop, and maintain Apache Spark pipelines (batch and streaming) using PySpark and/or Scala
- Process and transform large-scale datasets using modern data lake architectures (Delta Lake, Parquet)
- Optimize Spark jobs for performance, including:
o Partitioning strategies
o Shuffle optimization
o Memory tuning
o File sizing and storage efficiency
- Implement Structured Streaming pipelines for near real-time data processing
- Develop and deploy Spark applications within containerized environments (Docker)
- Execute workloads in Kubernetes clusters, supporting scalable and distributed processing
- Integrate Spark pipelines with downstream systems, including:
o Analytics platforms (SQL, notebooks)
o AI/ML workflows and feature engineering pipelines
- Support data ingestion and storage in object-based systems (e.g., S3-compatible storage)
- Troubleshoot data pipeline failures and ensure reliability in mission-critical environments
- Operate within secure, air-gapped environments
Requirements
- TS/SCI (eligibility) with ability/willingness to obtain/maintain counterintelligence polygraph
- Bachelor's degree plus 5 years' experience in data engineering or Spark development (additional years of experience may be substituted for the degree)
- Strong hands-on experience with:
o Apache Spark (mandatory)
o Python (PySpark)
o Data processing at scale
- Experience working with:
o Parquet and/or Delta Lake
o Distributed data systems
- Familiarity with:
o Docker / containerization
o Kubernetes (basic to intermediate experience)
- Experience with object storage systems (e.g., S3 or equivalent)
- Strong troubleshooting and performance tuning skills
- Proficiency in Bash or scripting
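The file-sizing and tuning skills called out above often start from a common community rule of thumb: target roughly 128 MB per Parquet file/partition on object storage. A hypothetical helper (the function name and default are illustrative; the right target is workload-dependent and should be validated against the Spark UI):

```python
def target_partitions(total_bytes: int, target_bytes: int = 128 * 1024 * 1024) -> int:
    """Suggest a partition count aiming at ~target_bytes per output file.

    128 MB is a widely used default for Parquet on object storage; tune it
    per workload rather than treating it as a hard requirement.
    """
    if total_bytes <= 0:
        return 1
    # Ceiling division so the last partial chunk still gets a partition.
    return max(1, (total_bytes + target_bytes - 1) // target_bytes)

print(target_partitions(10 * 1024**3))  # 10 GiB -> 80 partitions
```

A value like this would typically feed `DataFrame.repartition(n)` or `coalesce(n)` before a write to avoid the small-files problem.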
Preferred Qualifications:
- Experience with Scala for Spark development
- Experience with Structured Streaming in production environments
- Familiarity with Iceberg or lakehouse architectures
- Experience with CI/CD pipelines (Jenkins, Git)
- Exposure to Terraform or Infrastructure as Code
- Experience supporting AI/ML data pipelines
- Prior experience supporting NGA, IC, or DoD programs
Benefits & conditions
Some of our benefits include:
- Generous PTO plus 11 Federal Holidays
- Retirement Planning - 401k Fully Vested with Match
- Tuition Assistance Program - Annual contributions to help you pay down your loans
- Annual Health and Wellness Allowance - buy an Apple Watch, a treadmill, or hit the gym on us
- Career Development - Annual Funds to spend on Education and Training
- Volunteer Time Off - Annually, all employees can spend 8 hours directly supporting a charity of choice
- Charitable Match - ABSC matches an employee's donation to a qualifying charity
- Referral Program - We pay for internal and external referrals!
- LOV Awards - Earn bonus awards throughout the year from our Living Our Values awards program