Databricks Engineer
Cyber Sphere LLC
Atlanta, United States of America
6 days ago
Role details
Contract type: Permanent contract
Employment type: Full-time (> 32 hours)
Working hours: Regular working hours
Languages: English
Experience level: Senior
Job location: Atlanta, United States of America
Tech stack
Amazon Web Services (AWS)
Business Analytics Applications
Big Data
Cloud Engineering
Data Governance
Data Integrity
ETL
Dimensional Modeling
Distributed Computing Environment
Hive
Python
Performance Tuning
Role-Based Access Control
Cloud Services
Azure
SQL Databases
Data Streaming
Enterprise Data Management
Data Processing
Feature Engineering
Delivery Pipeline
Snowflake
Spark
Multi-Cloud
Git
Data Lake
PySpark
Deployment Automation
Kafka
Data Management
Machine Learning Operations
Video Streaming
Data Pipelines
Key Vault
Databricks
Job description
- Design, develop, and maintain ETL/ELT pipelines using Databricks (PySpark/SQL) for batch and streaming workloads.
- Build scalable data processing solutions leveraging Spark (RDD, DataFrames, Datasets).
- Develop and optimize Delta Lake tables, medallion architecture, and data quality frameworks.
- Implement CI/CD pipelines for Databricks notebooks, jobs, and workflows.
- Integrate Databricks with Azure Data Factory, ADLS Gen2, Synapse, Event Hubs, and other cloud services.
- Configure and manage Databricks clusters, Unity Catalog, permissions, and workspace governance.
- Collaborate with data architects, analysts, and business stakeholders to translate requirements into technical solutions.
- Ensure data reliability, performance tuning, and cost optimization across the platform.
- Implement security best practices including RBAC, Key Vault integration, and data encryption.
- Troubleshoot production issues and support operational workloads.
Requirements
We are seeking a Databricks Engineer to design, build, and optimize large-scale data pipelines and analytics solutions using Azure Databricks, Spark, and modern cloud data platforms. The ideal candidate has strong experience in distributed data processing, ETL/ELT development, and cloud-native engineering practices.
- Strong hands-on experience with Azure Databricks, PySpark, and Spark SQL.
- Proficiency in Python, SQL, and distributed data processing.
- Experience with Azure Data Factory, ADLS, Synapse, or similar cloud data services.
- Knowledge of Delta Lake, ACID transactions, schema evolution, and time travel.
- Understanding of data modeling, including dimensional modeling and SCD patterns.
- Experience with Git, DevOps pipelines, and automated deployments.
- Familiarity with streaming technologies (Structured Streaming, Event Hubs, Kafka).
Preferred Qualifications
- Experience with Unity Catalog, Delta Sharing, or Databricks governance frameworks.
- Background in Snowflake, AWS, or multi-cloud environments.
- Exposure to MLflow, feature engineering, or machine learning pipelines.
- Industry experience in healthcare, finance, utilities, or enterprise data platforms.
Certifications (Preferred)
- Databricks Certified Data Engineer Associate
- Databricks Certified Data Engineer Professional