Data Engineer
Job description
Development
Contribute to the development of ETL pipelines in the Enterprise Data Warehouse (EDW)
Support legacy ETL jobs, which currently do not use ACID transactions, and rebuild them on Apache Spark and Apache Iceberg to add ACID transaction support
Transform and integrate EBCDIC Mainframe data into Hive and Impala tables using Precisely Connect for Big Data
Optimize data transformation processes for performance, scalability, and reliability
Ensure data consistency, accuracy, and quality across the ETL pipelines
Utilize best practices for ETL code development, version control, and deployment using Azure DevOps
Production Support
Share 24/7 production support with the managed service vendor in week-long shifts on a 4-week rotation
Monitor ETL workflows and troubleshoot issues to ensure smooth production operations
Research and resolve user requests and issues
Collaboration and Stakeholder Engagement
Collaborate with cross-functional teams, including data engineers, business analysts, administrators, and quality analyst engineers, to ensure alignment on requirements and deliverables
Engage with business stakeholders to understand data requirements and translate them into scalable technical solutions
Technical Governance
Contribute to process documentation and follow best practices within the Enterprise Data Warehouse
Follow proper SDLC protocols within the Azure DevOps code repository
Stay updated on emerging technologies and trends to continuously improve data platform capabilities
Other tasks as assigned by management
Requirements
Bachelor's degree in IT or a similar field (additional equivalent experience above the required minimum may be substituted for the degree requirement)
3+ years of experience in ETL development and data engineering roles
3+ years of advanced SQL experience
3+ years in Python and Linux for Spark-based development
Proven experience using Apache Spark, Apache Iceberg, or Apache Airflow for ETL pipelines
Strong familiarity with version control systems, especially Azure DevOps
Knowledge of data governance and security best practices in a distributed data environment
Familiarity with data modeling, schema design, and building data models for reporting needs
In-depth understanding of ETL frameworks, ACID transactions, change data capture, and distributed computing
Experience in designing and managing large-scale data pipelines and workflows
Excellent problem-solving and troubleshooting skills
Effective communication and collaboration skills for working with diverse teams and stakeholders
Timeline-centric mindset
Awareness of enterprise applications and technical alignment standards
Some travel may be required
Preferred Qualifications
Experience with Cloudera Data Platform (CDP), including Hive and Impala
Knowledge of Precisely Connect for Big Data or similar tools for mainframe data transformation
Benefits & conditions
The salary range for this role is $120,000 to $170,000 or the hourly equivalent. Pay is based on several factors, including but not limited to education, work experience, and certifications. In addition to your salary, Turnberry Solutions offers benefits such as a comprehensive healthcare package (medical, dental, vision), disability and group term life insurance, health and flexible spending accounts, a utilization bonus, 401(k) with match, flexible time off for salaried employees, parental leave for salaried employees, and flexible work arrangements (all benefits are subject to eligibility requirements). No matter where or when you begin a career with Turnberry, you'll find a far-reaching choice of benefits and incentives.