Databricks Architect/Administrator
Job description
Our firm is partnering with a large, highly regulated enterprise in the insurance and financial services sector to identify a senior Databricks Architect / Administrator. This role serves as a technical leader responsible for the design, governance, and ongoing optimization of an enterprise-scale Databricks platform that supports analytics, data engineering, and advanced AI/ML use cases. The position operates as a senior individual contributor and works closely with data engineering, analytics, infrastructure, and security teams. The environment supports a multi-platform data ecosystem that includes both modern cloud-native ingestion tools and established enterprise ETL technologies. A strong foundation in Unix/Linux systems administration is essential due to the depth of interaction with the underlying compute and operating environments.

The Databricks Architect / Administrator will own platform architecture, operational reliability, cost optimization, and governance for Databricks across the enterprise. This individual is the primary technical authority for Databricks and a key partner in shaping the platform roadmap.

Platform Architecture & Design
- Architect and govern the enterprise Databricks environment, including workspace topology, Unity Catalog structure, and access control models.
- Define and enforce standards for cluster configurations, runtime versions, instance pools, auto-scaling, and compute strategies.
- Design scalable and performant data pipeline patterns using Delta Live Tables, Databricks Workflows, and structured streaming.
- Establish and maintain Delta Lake standards, including table design, partitioning strategies, Z-ordering, and OPTIMIZE/VACUUM schedules.
- Lead integration design with upstream ingestion platforms, ensuring reliable, governed data delivery.
Unix/Linux Infrastructure & Operations
- Administer and troubleshoot Unix/Linux-based environments that underpin Databricks compute and cluster lifecycle operations.
- Develop and maintain Bash and Python automation for platform monitoring, maintenance, and operational workflows.
- Manage file systems, permissions, and data movement across Linux-based storage and compute layers.
- Partner with infrastructure and cloud teams on VM-level diagnostics, tuning, and performance optimization.
Cost Management & Optimization
- Track and report Databricks usage and DBU consumption; identify cost optimization opportunities across workloads and environments.
- Implement cost attribution and reporting models to support showback or chargeback by team or business unit.
- Support capacity planning, forecasting, and long-term commitment utilization in partnership with senior leadership.
Governance, Security & Compliance
- Design and implement governance frameworks within Unity Catalog, including lineage, tagging, and access auditing.
- Collaborate with cybersecurity teams to meet enterprise security controls such as secrets management, encryption, and network isolation.
- Support audit and compliance efforts through documentation of configurations, access policies, and data standards.
Automation & Artificial Intelligence
- Build automation frameworks for cluster lifecycle management, job orchestration, alerting, and self-healing workflows.
- Enable MLOps practices using Databricks AutoML, MLflow, and Model Serving.
- Integrate AI-assisted development tools into engineering workflows to accelerate delivery and reduce manual overhead.
- Partner with data science teams on scalable feature engineering, model deployment patterns, and AI-enabled data pipelines.
- Evaluate and recommend new Databricks features, partner integrations, and AI/ML capabilities aligned to enterprise strategy.
Platform Leadership & Support
- Serve as the primary technical escalation point for Databricks-related issues across engineering and analytics teams.
- Participate in sprint planning and platform change management using tools such as Jira and ServiceNow.
- Produce and maintain platform documentation, architectural designs, runbooks, and onboarding materials.
Requirements
- 7+ years of experience in data engineering or data platform roles, including at least 4 years of hands-on Databricks experience.
- Deep expertise with Databricks services such as Unity Catalog, Delta Lake, Databricks Workflows, Delta Live Tables, and SQL Warehouses.
- Strong Unix/Linux experience, including shell scripting, process management, file systems, scheduling, and environment configuration.
- Proficiency in Python and PySpark for distributed data processing and automation.
- Experience working with cloud platforms (AWS, Azure, or GCP), including compute, storage, networking, and identity/security services.
- Proven ability to design scalable, cost-efficient, and reliable enterprise data platforms.
- Hands-on experience building automation for orchestration, monitoring, alerting, and platform self-healing.
- Working knowledge of AI/ML tooling in the Databricks ecosystem, including MLflow and AutoML; exposure to generative AI is beneficial.
- Experience integrating with Oracle databases, including SQL development and data extraction patterns.
- Proficiency with Git-based version control, CI/CD pipelines, and collaborative development workflows.
- Familiarity with IT service management and delivery frameworks such as ServiceNow and Jira.
- Strong communication skills with the ability to explain complex technical concepts to diverse audiences.
Preferred qualifications
- Hands-on experience with MLflow experiment tracking, model registries, and production deployment.
- Exposure to generative AI frameworks (e.g., LangChain, LlamaIndex) or LLM-enabled data workflows such as retrieval-augmented generation (RAG) pipelines.
- Experience with enterprise workflow orchestration platforms such as Databricks Workflows or Apache Airflow.
- Background integrating Databricks with ETL/ELT platforms; experience with Fivetran or Ab Initio is a strong plus.
- Familiarity with enterprise data governance tools and practices.
- Experience supporting data platforms in regulated industries such as insurance or financial services.
- Knowledge of Infrastructure-as-Code tools (e.g., Terraform, Ansible).
- Experience designing disaster recovery and resiliency strategies for cloud-based data platforms.
Benefits & conditions
Dahl Consulting is proud to offer a comprehensive benefits package to eligible employees that will allow you to choose the best coverage to meet your family's needs. For details, please review the DAHL Benefits Summary: https://www.dahlconsulting.com/benefits-w2fta/.