Databricks Architect/Administrator

Dahl Consulting
Hartford, United States of America
yesterday

Role details

Contract type
Temporary contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$114K

Job location

Hartford, United States of America

Tech stack

Artificial Intelligence
Airflow
Amazon Web Services (AWS)
Data analysis
Computing Platforms
JIRA
Build Automation
Azure
Bash
Continuous Integration
Information Engineering
Data Governance
Data Infrastructure
ETL
Data Mining
Dataspaces
Database Development
Linux
Disaster Recovery
File Systems
Distributed Computing Environment
Python
Key Management
Machine Learning
Oracle
Performance Tuning
Scrum
Ansible
Runbook
Shell Script
Software Deployment
SQL Databases
Systems Integration
Enterprise Data Management
Cloud Platform System
Feature Engineering
Autoscaling
Large Language Models
Ab Initio
Data Lake
PySpark
Infrastructure Automation Frameworks
Integration Frameworks
Data Management
Machine Learning Operations
Data Delivery
Terraform
Multiplatform
Software Version Control
Data Pipelines
ServiceNow
Databricks

Job description

Our firm is partnering with a large, highly regulated enterprise in the insurance and financial services sector to identify a senior Databricks Architect / Administrator. The role is a technical leadership position responsible for the design, governance, and ongoing optimization of an enterprise-scale Databricks platform that supports analytics, data engineering, and advanced AI/ML use cases. The position operates as a senior individual contributor and works closely with data engineering, analytics, infrastructure, and security teams. The environment supports a multi-platform data ecosystem, including both modern cloud-native ingestion tools and established enterprise ETL technologies. A strong foundation in Unix/Linux systems administration is essential due to the depth of interaction with the underlying compute and operating environments.

The Databricks Architect / Administrator will own platform architecture, operational reliability, cost optimization, and governance for Databricks across the enterprise. This individual is the primary technical authority for Databricks and a key partner in shaping the platform roadmap.

Platform Architecture & Design

  • Architect and govern the enterprise Databricks environment, including workspace topology, Unity Catalog structure, and access control models.
  • Define and enforce standards for cluster configurations, runtime versions, instance pools, auto-scaling, and compute strategies.
  • Design scalable and performant data pipeline patterns using Delta Live Tables, Databricks Workflows, and structured streaming.
  • Establish and maintain Delta Lake standards, including table design, partitioning strategies, Z-ordering, and OPTIMIZE/VACUUM schedules (a maintenance sketch follows this list).
  • Lead integration design with upstream ingestion platforms, ensuring reliable, governed data delivery.
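
As a point of reference, a minimal PySpark sketch of the Delta Lake maintenance pattern referenced above; the catalog, schema, table, and column names are hypothetical, and the Z-order columns and retention window would follow the standards this role defines.

    # Illustrative Delta Lake maintenance pass; "spark" is the session provided by the
    # Databricks runtime. OPTIMIZE compacts small files and ZORDER co-locates rows on
    # commonly filtered columns; VACUUM removes files no longer referenced by the table.
    spark.sql("OPTIMIZE main.sales.transactions ZORDER BY (customer_id, event_date)")
    spark.sql("VACUUM main.sales.transactions RETAIN 168 HOURS")  # 7-day retention, per policy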

Unix/Linux Infrastructure & Operations

  • Administer and troubleshoot Unix/Linux-based environments that underpin Databricks compute and cluster lifecycle operations.
  • Develop and maintain Bash and Python automation for platform monitoring, maintenance, and operational workflows (see the sketch after this list).
  • Manage file systems, permissions, and data movement across Linux-based storage and compute layers.
  • Partner with infrastructure and cloud teams on VM-level diagnostics, tuning, and performance optimization.
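
As a sketch of the Python automation described above, the snippet below uses the Databricks SDK for Python to enumerate clusters and their current states as one input to monitoring or alerting; authentication via DATABRICKS_HOST/DATABRICKS_TOKEN and the choice of what to alert on are assumptions of this example.

    # Hypothetical health-check sketch using the Databricks SDK for Python (databricks-sdk).
    # Credentials are assumed to come from the standard environment variables or a
    # configured profile; the script simply prints each cluster and its current state.
    from databricks.sdk import WorkspaceClient

    w = WorkspaceClient()
    for cluster in w.clusters.list():
        state = cluster.state.value if cluster.state else "UNKNOWN"
        print(f"{cluster.cluster_name}: {state}")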

Cost Management & Optimization

  • Track and report Databricks usage and consumption; identify cost optimization opportunities across workloads and environments.
  • Implement cost attribution and reporting models to support showback or chargeback by team or business unit (a query sketch follows this list).
  • Support capacity planning, forecasting, and long-term commitment utilization in partnership with senior leadership.
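
For illustration, a cost-attribution query of the sort implied above, written against the Databricks billing system table; it assumes system tables are enabled and that workloads carry a hypothetical 'team' tag used for showback or chargeback.

    # Illustrative showback query; "spark" is the Databricks runtime session and the
    # 'team' tag key is a placeholder for whatever tagging standard the platform adopts.
    usage_by_team = spark.sql("""
        SELECT usage_date,
               custom_tags['team']  AS team,
               sku_name,
               SUM(usage_quantity)  AS dbus
        FROM   system.billing.usage
        WHERE  usage_date >= date_sub(current_date(), 30)
        GROUP  BY usage_date, custom_tags['team'], sku_name
    """)
    usage_by_team.show()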

Governance, Security & Compliance

  • Design and implement governance frameworks within Unity Catalog, including lineage, tagging, and access auditing (illustrated in the sketch after this list).
  • Collaborate with cybersecurity teams to meet enterprise security controls such as secrets management, encryption, and network isolation.
  • Support audit and compliance efforts through documentation of configurations, access policies, and data standards.
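
As an example of the Unity Catalog patterns described above, a few illustrative governance statements; the catalog, schema, group, and tag names are hypothetical.

    # Illustrative Unity Catalog grants and tagging; all names are placeholders.
    spark.sql("GRANT USE CATALOG ON CATALOG analytics TO `data-analysts`")
    spark.sql("GRANT SELECT ON SCHEMA analytics.curated TO `data-analysts`")
    spark.sql("ALTER TABLE analytics.curated.customers "
              "SET TAGS ('data_classification' = 'confidential', 'domain' = 'customer')")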

Automation & Artificial Intelligence

  • Build automation frameworks for cluster lifecycle management, job orchestration, alerting, and self-healing workflows.
  • Enable machine learning operations using Databricks AutoML, MLflow, and Model Serving (see the MLflow sketch after this list).
  • Integrate AI-assisted development tools into engineering workflows to accelerate delivery and reduce manual overhead.
  • Partner with data science teams on scalable feature engineering, model deployment patterns, and AI-enabled data pipelines.
  • Evaluate and recommend new Databricks features, partner integrations, and AI/ML capabilities aligned to enterprise strategy.
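
As a minimal sketch of the MLflow-based workflow mentioned above, the example below tracks an experiment run and registers the resulting model; the experiment path, registered model name, and toy scikit-learn model are placeholders.

    # Track an experiment run and register the model with MLflow; all names are examples.
    import mlflow
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    mlflow.set_experiment("/Shared/platform-demos/iris-demo")
    X, y = load_iris(return_X_y=True)
    with mlflow.start_run():
        model = LogisticRegression(max_iter=200).fit(X, y)
        mlflow.log_metric("train_accuracy", model.score(X, y))
        mlflow.sklearn.log_model(model, "model", registered_model_name="platform_demo_iris")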

Platform Leadership & Support

  • Serve as the primary technical escalation point for Databricks-related issues across engineering and analytics teams.
  • Participate in sprint planning and platform change management using tools such as Jira and ServiceNow.
  • Produce and maintain platform documentation, architectural designs, runbooks, and onboarding materials.

Requirements

  • 7+ years of experience in data engineering or data platform roles, including at least 4 years of hands-on Databricks experience.
  • Deep expertise with Databricks services such as Unity Catalog, Delta Lake, Databricks Workflows, Delta Live Tables, and SQL Warehouses.
  • Strong Unix/Linux experience, including shell scripting, process management, file systems, scheduling, and environment configuration.
  • Proficiency in Python and PySpark for distributed data processing and automation.
  • Experience working with cloud platforms (AWS, Azure, or GCP), including compute, storage, networking, and identity/security services.
  • Proven ability to design scalable, cost-efficient, and reliable enterprise data platforms.
  • Hands-on experience building automation for orchestration, monitoring, alerting, and platform self-healing.
  • Working knowledge of AI/ML tooling in the Databricks ecosystem, including MLflow and AutoML; exposure to generative AI is beneficial.
  • Experience integrating with Oracle databases, including SQL development and data extraction patterns.
  • Proficiency with Git-based version control, CI/CD pipelines, and collaborative development workflows.
  • Familiarity with IT service management and delivery frameworks such as ServiceNow and Jira.
  • Strong communication skills with the ability to explain complex technical concepts to diverse audiences.

Preferred Qualifications

  • Hands-on experience with MLflow experiment tracking, model registries, and production deployment.
  • Exposure to generative AI frameworks (e.g., LangChain, LlamaIndex) or LLM-enabled data workflows such as RAG pipelines.
  • Experience with enterprise workflow orchestration platforms such as Databricks Workflows or Apache Airflow.
  • Background integrating Databricks with ETL/ELT platforms; experience with Fivetran or Ab Initio is a strong plus.
  • Familiarity with enterprise data governance tools and practices.
  • Experience supporting data platforms in regulated industries such as insurance or financial services.
  • Knowledge of Infrastructure-as-Code tools (e.g., Terraform, Ansible).
  • Experience designing disaster recovery and resiliency strategies for cloud-based data platforms.

Benefits & conditions

Dahl Consulting is proud to offer a comprehensive benefits package to eligible employees that will allow you to choose the best coverage to meet your family's needs. For details, please review the DAHL Benefits Summary: https://www.dahlconsulting.com/benefits-w2fta/.

Apply for this position