Senior Data Engineer
Job description
The Senior Data Engineer supports enterprise data warehousing and analytics initiatives in a Databricks and Azure SQL environment. This role focuses on designing, building, and maintaining scalable and reliable data pipelines and analytics-ready data models that enable reporting and data-driven decision-making across the organization. The Senior Data Engineer partners closely with BI developers, product owners, and business stakeholders to ensure data is accurate, governed, and aligned with enterprise standards while meeting compliance requirements.
Responsibilities
- Design, build, and maintain scalable and reliable data pipeline architectures and patterns.
- Assemble large and complex data sets that meet functional and non-functional business requirements.
- Identify and implement process improvements, including automation of manual workflows, optimization of data delivery, and infrastructure enhancements for scalability.
- Develop ETL and ELT pipelines using Databricks and Azure services to ingest, transform, and load data from diverse source systems.
- Design and implement analytics-ready data models to support reporting and downstream consumption.
- Create automated data quality checks and tests to monitor the reliability and integrity of data assets.
- Partner with stakeholders to support data-related technical issues and evolving data infrastructure needs.
- Ensure secure data separation across Azure regions while maintaining compliance with HIPAA, HITECH, and other applicable regulations.
Requirements
Education: A bachelor's degree in computer science, statistics, informatics, information systems, or a related quantitative field is required.
Experience: 10 or more years of experience in data engineering or BI development roles is required, including work with data lake and lakehouse architectures, designing pipeline architectures, and developing and maintaining APIs and web services.
Technical Skills: Proficiency in SQL is required, including complex query development and performance tuning. Hands-on experience is necessary with Azure services (Databricks, Azure Data Factory, Azure SQL Database, Azure Data Lake Storage Gen2, Logic Apps), PySpark, Databricks Delta tables, Spark declarative pipelines, and a programming language such as Python or Scala.
Knowledge, Skills, and Abilities
- Advanced SQL expertise with a focus on performance tuning and optimization.
- Experience building large-scale ETL and ELT pipelines using Databricks and Apache Spark.
- Understanding of data lake and lakehouse concepts, including incremental ingestion, partitioning strategies, schema enforcement, and Delta Lake best practices.
- Ability to design and implement dimensional data models (star and snowflake schemas) to support Power BI and analytics use cases.
- Experience managing metadata, data lineage, dependencies, and orchestration for reliable pipelines.
- Analytical and troubleshooting skills, including root cause analysis.
- Ability to collaborate with data architects, DBAs, BI developers, and data scientists.
- Proactive approach to identifying data quality risks, pipeline failures, and performance bottlenecks.
- Attention to detail with a focus on documentation, governance, version control, and auditability.
- Ability to work in an Agile, cross-functional environment while managing multiple priorities.
- Willingness and ability to mentor junior engineers and promote Databricks and Spark best practices.