Data Engineer
Job description
As a Data Engineer, you will be responsible for developing and maintaining data pipelines across the medallion architecture (bronze, silver, gold) using Microsoft Fabric and Azure technologies. You'll work closely with senior engineers, analysts, and business stakeholders to translate requirements into reliable, scalable data engineering solutions. This role requires strong hands-on experience with Python, PySpark, and data pipeline orchestration tools, as well as a focus on data quality, performance optimization, and operational excellence.
- Build and maintain data ingestion pipelines using Microsoft Fabric (Pipelines, Dataflows Gen2, Notebooks) and Azure Data Factory
- Develop Python and PySpark notebooks to cleanse, transform, and enrich data across medallion architecture layers
- Implement Delta Lake patterns including partitioning, schema evolution, and incremental loads
- Apply data quality checks including null handling, duplicate detection, schema validation, and SCD logic
- Develop reusable, metadata-driven, and parameterized pipeline frameworks
- Troubleshoot pipeline failures, performance issues, and data discrepancies; perform root-cause analysis
- Participate in code reviews, testing, and CI/CD processes using Git-based workflows
- Support gold-layer datasets and semantic models for Power BI consumption
- Monitor pipelines and resolve operational issues within defined SLAs; support on-call rotations as needed
- Maintain technical documentation such as data dictionaries, mappings, and runbooks
- Collaborate cross-functionally to deliver high-quality data engineering solutions

- Opportunity to work with modern data technologies including Microsoft Fabric and Azure
- Collaborative environment with experienced data engineering professionals
- Exposure to enterprise-level data, analytics, and AI initiatives
- Growth opportunities through hands-on learning and ownership of impactful projects
Requirements
Aptean is seeking a hands-on Data Engineer to build, test, and maintain scalable data ingestion and transformation pipelines on Microsoft Fabric. This role is execution-focused and centered on delivering high-quality data solutions that support enterprise reporting, analytics, and AI use cases. The ideal candidate brings experience in modern lakehouse environments and a strong sense of ownership with a willingness to learn.
- 3-5 years of experience building data pipelines in production or near-production environments
- Hands-on experience with Python and PySpark
- Experience with Azure Data Factory or similar orchestration tools
- Familiarity with Microsoft Fabric (or strong experience in Azure Synapse/Databricks with willingness to learn Fabric)
- Solid understanding of SQL, data modeling, and performance optimization
- Knowledge of medallion architecture and Delta Lake concepts
- Experience implementing data quality checks and basic SCD handling
- Familiarity with Git, branching, and CI/CD practices (Azure DevOps or GitHub Actions)
- Exposure to Power BI and semantic data models is a plus
- Understanding of data governance concepts such as schema versioning and access controls preferred
- Strong problem-solving skills, attention to detail, and ownership mindset
- Effective communication skills and ability to collaborate with technical and business teams