Databricks Data Architect
Job description
We are looking for a highly skilled Databricks Architect to lead the design and technical planning for a large-scale migration to the Databricks platform. This is a strategic pre-execution role where you will drive technical architecture discussions with client stakeholders and internal teams to lay the foundation for a successful project. The role involves working closely with our leadership and the client's technical leadership to define the migration blueprint, leverage accelerators like Turgon, and prepare the Statement of Work (SOW) for execution.
Responsibilities
Act as the technical lead and engage with client architects and engineers to understand the current ecosystem and design the target state on Databricks.
Define the architecture for data ingestion, ETL/ELT pipelines, data lakehouses, and advanced analytics using the Databricks Lakehouse Platform.
Evaluate existing systems (e.g., AWS Redshift, Spark on EMR, GKE, Kafka) and design migration strategies to Databricks with minimal disruption.
Develop architecture artifacts including data flow diagrams, integration patterns, and component-level designs.
Identify reusable components, best practices, and automation opportunities to accelerate delivery using tools such as Turgon.
Collaborate with infrastructure, security, and DevOps teams to define cluster sizing, deployment models, CI/CD pipelines, and cost optimization.
Support SOW and project scoping discussions with effort estimations and risk identification.
Stay engaged through early build phases, ensuring architectural integrity during handoff to engineering teams.
Requirements
8-10+ years of experience in data engineering and architecture, including at least 3 years on Databricks.
Deep knowledge of Databricks Lakehouse Architecture, including Unity Catalog, Delta Lake, DBFS, MLflow, and Structured Streaming.
Strong background in building large-scale ETL/ELT pipelines using PySpark, Spark SQL, dbx, and Databricks Workflows.
Hands-on experience with the AWS stack (S3, Redshift, EMR, Lambda) or Azure equivalents.
Experience integrating Databricks with Kafka, Airflow, CI/CD pipelines, and container orchestration platforms like Kubernetes.
Familiarity with security frameworks, role-based access control, and compliance in enterprise data environments.
Strong communication and leadership skills, capable of driving discussions with senior architects and business leaders.
Nice to Have
Experience using Turgon or other internal accelerators in migration/modernization projects.
Prior experience in manufacturing, semiconductor, or hardware-driven enterprises.