Data Engineer
Job description
You will work closely with internal stakeholders and external data providers to manage periodic data loads, schema changes, and historical backfills. Strong SQL expertise is essential, along with experience supporting static and incrementally growing datasets in a massively parallel processing (MPP) environment.
Responsibilities
- Design, develop, and maintain ETL/ELT pipelines for large, structured datasets
- Own and support 3-5 primary datasets, handling scheduled loads (monthly, quarterly, or annual) and schema updates
- Perform full reloads and historical backfills as data models evolve
- Collaborate with internal partners and third-party data providers to ensure accurate, timely data delivery
- Ensure pipelines are scalable, repeatable, secure, and well-documented
- Implement and enforce data quality, validation, and integrity controls
- Conduct regular and ad hoc data audits, reviews, and investigations
- Produce technical documentation for data structures, processes, and standards
- Identify opportunities to improve data ingestion, modeling, and quality processes
- Support PostgreSQL schemas in a Greenplum (MPP) data warehouse environment
- Participate in Agile/DevOps workflows, working independently or as part of a database/data platform team
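The "full reloads and historical backfills" responsibility above often takes the form of a staged load followed by an atomic table swap, which PostgreSQL-family databases (including Greenplum) support via transactional DDL. A minimal sketch that only builds the SQL text — the table names and the server-side file path are hypothetical, not from this posting:

```python
def build_full_reload_sql(table: str, staging: str) -> str:
    """Build a transactional full-reload script: load into a staging
    table, then atomically swap it with the live table.

    All identifiers and the COPY source path are illustrative and must
    come from trusted configuration, never from user input.
    """
    return (
        "BEGIN;\n"
        f"TRUNCATE {staging};\n"
        # Server-side COPY path is a placeholder for the real load source.
        f"COPY {staging} FROM '/data/{table}.csv' CSV HEADER;\n"
        # Rename swap keeps the old data available for rollback checks.
        f"ALTER TABLE {table} RENAME TO {table}_old;\n"
        f"ALTER TABLE {staging} RENAME TO {table};\n"
        f"ALTER TABLE {table}_old RENAME TO {staging};\n"
        "COMMIT;"
    )
```

Because the swap happens inside one transaction, downstream readers see either the old dataset or the new one, never a half-loaded table — a useful property when a refresh replaces billions of rows.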
Data Environment
- PostgreSQL data warehouse running on Greenplum MPP architecture
- Approximately 25 heterogeneous datasets totaling 300+ TB
- Data domains include financial, credit, mortgage, risk, real estate, and market data
- Some datasets contain billions of rows and grow incrementally over time
- Data is generally static between loads, with full reloads performed on refresh
- Mix of manual and automated loads, primarily driven by Linux and Bash scripting
- Production and contingency environments are mirrored for resiliency
- Limited downstream BI usage (e.g., select Tableau feeds)
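The mirrored production/contingency setup described above lends itself to simple reconciliation audits — for example, confirming both environments hold the same row count after a load. A sketch using Python's `sqlite3` as a stand-in for the two PostgreSQL connections (the table name is hypothetical):

```python
import sqlite3


def count_rows(conn, table: str) -> int:
    # Identifiers can't be parameterized, so the table name must come
    # from trusted config, never user input.
    return conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]


def reconcile(prod, contingency, table: str) -> bool:
    """Return True when both environments hold the same row count."""
    return count_rows(prod, table) == count_rows(contingency, table)
```

In practice the same check would run over two `psycopg2` (or similar) connections, and a fuller audit would also compare checksums or per-partition counts rather than a single total.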
Requirements
- Advanced SQL (complex querying, performance tuning, data validation)
- PostgreSQL data modeling and schema management
- ETL / ELT pipeline development
- Data warehousing concepts and best practices
- Linux / Bash scripting
- Git and GitLab version control workflows
- Experience working with large-scale, structured enterprise datasets
Preferred / Nice-to-Have Skills
- Greenplum or other MPP database platforms
- Python for data processing or automation
- Exposure to Databricks environments
- Familiarity with cloud platforms (AWS)
- Tableau or other BI/reporting tools
Qualifications
- Bachelor's degree in a STEM-related field (or equivalent practical experience)
- 4-10+ years of relevant data engineering experience (level dependent)
- Proven experience developing and maintaining complex SQL across relational databases
- Strong understanding of data quality, governance, and validation practices
- Ability to work independently with minimal supervision
- Senior-level candidates should demonstrate subject-matter depth and serve as a technical resource to others
Benefits & conditions
This is a contract position based out of Kansas City, MO.
Pay and Benefits
The pay range for this position is $70.00 - $85.00/hr. Eligibility requirements apply to some benefits and may depend on your job classification and length of employment. Benefits are subject to change and may be subject to specific elections, plan, or program terms. If eligible, the benefits available for this temporary role may include:
- Medical, dental & vision
- Critical Illness, Accident, and Hospital coverage
- 401(k) Retirement Plan - pre-tax and Roth post-tax contributions available
- Life Insurance (Voluntary Life & AD&D for the employee and dependents)
- Short- and long-term disability
- Health Savings Account (HSA)
- Transportation benefits
- Employee Assistance Program
- Time off/leave (PTO, vacation, or sick leave)