Databricks Engineer
Role details
Job description
Design, develop, and maintain scalable end-to-end data pipelines using Databricks, Delta Lake, PySpark, SQL, and AWS.
Build batch, real-time, and streaming data solutions using Kafka, AWS Kinesis, and related technologies.
Implement data ingestion, transformation, curation, and storage frameworks within a Databricks Lakehouse architecture.
Optimize Spark jobs, troubleshoot production issues, and ensure data quality, governance, lineage, and reliability.
Develop secure, scalable, and cost-efficient solutions leveraging AWS S3, Glue, Lambda, Kinesis, and Redshift.
Implement security controls including RBAC, encryption, and data masking.
Follow CI/CD, DevOps, and data engineering best practices using Git and automation tools.
Collaborate with BI, Analytics, Data Science, and business stakeholders to deliver curated datasets supporting reporting and AI/ML initiatives.
Lead architecture, design, and implementation of enterprise data solutions while driving Lakehouse best practices.
Mentor engineers, conduct code reviews, establish development standards, and oversee sprint planning and delivery.
Partner with cloud, security, and platform teams to ensure compliance, governance, and operational excellence.
Support analytics initiatives using Power BI and Tableau.
Requirements
Strong hands-on experience with Databricks.
Expertise in Python, PySpark, SQL, and distributed data processing.
Extensive experience with AWS services including S3, Glue, Lambda, Kinesis, and Redshift.
Strong experience building and supporting ETL/ELT pipelines.
Deep understanding of Delta Lake and Lakehouse architecture.
Experience with both streaming and batch processing frameworks.
Knowledge of CI/CD, Git, performance tuning, and troubleshooting.
Experience with Power BI and Tableau.
Databricks Certified Data Engineer Professional certification is mandatory.