Data Engineer with SQL and Databricks Experience (In-Person Interview)
Requirements
- ETL / ELT Concepts: Strong understanding of pipeline patterns, incremental loads, data validation, and troubleshooting.
- SQL: Advanced querying (CTEs, views, joins, complex query logic) and performance tuning for transformations and validation.
- Python: Production-quality development (modular code, testing, logging, integration with APIs/files, CI/CD, unit and integration test automation, code coverage).
- PySpark: Distributed transformations and performance optimization (joins, partitions, debugging), CI/CD, unit and integration test automation, code coverage.
- Azure Data Factory (ADF): Build and operate ADF pipelines, including parameterization, triggers, monitoring, and retry/error handling; integrate with Databricks and ADLS.
- Databricks: Develop and operationalize notebooks, jobs, and workflows; Delta Lake patterns; basic cluster and job configuration.
- Azure Fundamentals + Pulumi: Hands-on experience with ADLS Gen2, Azure Portal, Storage Explorer, Resource Groups, and Azure SQL, plus familiarity with integrating Azure OpenAI. Able to use and maintain Pulumi scripts for provisioning and managing Azure resources across environments.
Nice-to-have skills:
- Ability to support and translate validation rules into SQL scripts and to create data quality reports.
- TypeScript: Useful for Pulumi pipelines that create Azure components.
- Java: Useful for integration with existing services/components.
- .NET: Useful for integration with existing services/components.
- Angular / Spring Boot: Minor troubleshooting or coordination with app teams.