Data/Scala/Spark Engineering Specialist

Anagha Techno Soft

New York, United States of America

2 days ago

Role details

Contract type

Temporary contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Experience level

Senior

Job location

New York, United States of America

Tech stack

API

Airflow

Unit Testing

CA Workload Automation AE

Azure

Continuous Integration

Information Engineering

ETL

GitHub

Python

Networking Basics

Performance Tuning

Cloud Services

SQL Databases

Teradata

Snowflake

Spark

Star Schema

Serverless Computing

Databricks

Artifactory

Job description

We're migrating complex on-prem regulatory reporting pipelines from a legacy ETL + Autosys + SQL + Teradata stack to a modern Databricks + Snowflake platform on Azure. The role is hands-on: design, implement, test, and reconcile production pipelines feeding regulatory reports under strict parity requirements.

Requirements

Scala / Spark: production experience writing Spark applications in Scala (not just notebooks); comfortable with the DataFrame API, joins, window functions, partitioning, and performance tuning

Databricks: Serverless compute, Unity Catalog, Asset Bundles, Databricks CLI

SQL fluency: comfortable writing, analyzing, and extracting requirements from complex SQL scripts

Snowflake: schema design, performance, Spark-Snowflake connector

Azure: ADLS, networking basics, secrets/identity (Entra ID / managed identities)

Orchestration: Airflow (DAG authoring, sensors, retries, SLAs)

CI/CD: Artifactory and GitHub Actions pipelines, covering build, sharded test matrices, and artifact promotion through dev, QA, UAT, and prod

Testing: experience in TDD, writing unit tests (ScalaTest, AnyFlatSpec), and BDD (Concordion or equivalent)

Data quality & reconciliation: building automated parity checks against legacy outputs, drift detection, row-level reconciliation tooling

Large-scale migrations: proven track record migrating legacy ETL (Autosys/Informatica/etc.) to cloud data platforms, including dependency mapping and cutover planning

Modern data engineering practices: medallion architecture (Bronze/Silver/Gold), idempotent pipelines, schema evolution, lineage, observability

Nice-to-have

Financial services / regulatory reporting domain

Python (Databricks utilities, tooling)

Spec-driven development workflows (specs, plans, tasks, implementation)
