Data Engineer

Kforce Inc.
St. Louis, United States of America
12 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate

Job location

St. Louis, United States of America

Tech stack

Query Performance
Airflow
Architectural Patterns
Azure
Big Data
Code Review
Computer Programming
Continuous Integration
Data Architecture
Information Engineering
Data Governance
ETL
Data Security
Data Systems
Data Vault Modeling
Dimensional Modeling
GitHub
Hive
Python
Key Management
SQL Azure
Performance Tuning
SQL Databases
Data Streaming
File Transfer Protocol (FTP)
Spark
Git
Data Lake
PySpark
Git Flow
Information Technology
Collibra
Non-relational Database
Data Management
REST
Terraform
Data Pipelines
Databricks

Job description

Kforce has a client in Saint Louis, MO that is seeking a Data Engineer.

Responsibilities

The Data Engineer will architect and implement end-to-end data solutions on the Databricks Unity Catalog platform. This includes designing and building ETL/ELT pipelines that ingest data from diverse sources, transform it according to business requirements, and deliver it to downstream consumers. The role requires developing and optimizing Apache Spark jobs for processing large-scale datasets efficiently, implementing data quality frameworks to ensure accuracy and reliability, and influencing reusable frameworks and libraries to accelerate development across the team.

Collaboration is central to this position. The Data Engineer will work closely with product owners to understand requirements and deliver solutions that meet organizational needs. This includes translating business requirements into technical specifications, providing guidance on data architecture and best practices, and supporting analytics teams in accessing and utilizing data effectively.

Technical excellence and operational sustainability are expected. The Data Engineer will optimize query performance and resource utilization to control costs, implement comprehensive monitoring and alerting systems, maintain thorough documentation of data pipelines and processes, and ensure adherence to security policies and compliance requirements. The role also involves participating in code reviews and promoting engineering best practices throughout the data organization.

Requirements

  • Bachelor's degree in Computer Science, Engineering, or a related technical field, or equivalent practical experience
  • 3-5+ years of experience in data engineering roles is required, with at least 2 years of hands-on experience with Azure Databricks and Apache Spark
  • Strong programming skills in Python are essential, along with proficiency in SQL and experience with relational and non-relational databases
  • Demonstrated experience building configuration-driven ETL and orchestrating data pipelines using tools such as Azure Data Factory, Databricks Workflows, or Apache Airflow
  • Solid understanding of data modeling concepts including dimensional modeling and data vault methodologies, experience with Delta Lake and medallion architecture patterns, and familiarity with Azure services including Azure Data Lake Storage, Azure Data Factory, Azure SQL Database, and Azure Key Vault, as well as REST APIs and SFTP
  • Proficiency in Git for version control (branching strategies, pull requests, and collaborative development workflows) and in CI/CD practices for data pipelines is expected
  • Experience with streaming data processing using Structured Streaming or Event Hubs, knowledge of infrastructure as code using Terraform/Terragrunt or ARM templates, and familiarity with data governance tools and practices are valued
  • Experience with Unity Catalog for data governance and understanding of data security and compliance frameworks round out the ideal candidate profile
  • Familiarity with Azure Data Factory and Azure SQL DB/DW

Technical Skills:

  • Azure Databricks and Apache Spark
  • Python and PySpark
  • SQL, particularly Spark SQL
  • Azure Data Lake Storage (ADLS)
  • Delta Lake and Lakehouse architecture
  • Git and GitHub
  • Data orchestration and workflow management
  • Airflow and Databricks Workflows
  • Performance tuning and optimization
  • Data quality and testing frameworks

Benefits & conditions

The pay range is the lowest to highest compensation we reasonably in good faith believe we would pay at posting for this role. We may ultimately pay more or less than this range. Employee pay is based on factors like relevant education, qualifications, certifications, experience, skills, seniority, location, performance, union contract and business needs. This range may be modified in the future.

We offer comprehensive benefits including medical/dental/vision insurance, HSA, FSA, 401(k), and life, disability & AD&D insurance to eligible employees. Salaried personnel receive paid time off. Hourly employees are not eligible for paid time off unless required by law. Hourly employees on a Service Contract Act project are eligible for paid sick leave.

Note: Pay is not considered compensation until it is earned, vested and determinable. The amount and availability of any compensation remains in Kforce's sole discretion unless and until paid and may be modified in its discretion consistent with the law.
