Data Engineer
Javen Technologies, Inc
Chicago, United States of America
yesterday
Role details
Contract type
Temporary to permanent
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Job location
Chicago, United States of America
Tech stack
Agile Methodologies
Airflow
Amazon Web Services (AWS)
Azure
Cloud Computing
Data Architecture
Information Engineering
Data Governance
Data Infrastructure
Data Integrity
ETL
Data Security
Data Systems
Identity and Access Management
Machine Learning
Operational Databases
Standard SQL
SQL Databases
Data Streaming
Tableau
Technical Data Management Systems
Unstructured Data
Data Processing
Feature Engineering
System Availability
Spark
Event Driven Architecture
Data Lake
PySpark
Collibra
Data Management
Machine Learning Operations
Data Lakehouse
API Design
Data Pipelines
Serverless Computing
Alteryx
Databricks
Control-M
Job description
Data Engineering & Pipeline Development:
- Design, develop, and maintain end-to-end data pipelines in Databricks using Spark and Delta Lake
- Build and optimize ELT/ETL processes for structured and unstructured data ingestion into the Data Lakehouse
- Implement scalable ingestion patterns (batch and event-driven) from internal systems, third-party APIs, and cloud sources
- Develop data models (bronze, silver, gold layers) to support enterprise reporting, analytics, and downstream consumption
Data Platform & Integration:
- Integrate the Data Lakehouse with enterprise tools such as Tableau, Alteryx, and machine learning platforms
- Design and implement data access controls, identity management, and secure data sharing mechanisms
- Support API-based integrations and downstream data consumption patterns
Data Quality, Governance & Controls:
- Implement data quality checks, reconciliation processes, and monitoring within Databricks pipelines
- Ensure adherence to enterprise data governance standards, including lineage, metadata, and audit requirements
- Support regulatory and compliance requirements (e.g., data integrity, privacy, and security controls)
Cloud & Automation:
- Develop and manage workflows using orchestration tools (e.g., Airflow, Control-M)
- Automate data pipelines, deployments, and operational processes through CI/CD pipelines
- Leverage cloud-native services (AWS/Azure) for data processing, storage, and event-driven architectures
Operations & Support:
- Monitor, troubleshoot, and optimize data pipelines and Spark workloads for performance and reliability
- Support production data platforms, including incident resolution and root cause analysis
- Ensure high availability, data integrity, and SLA adherence across enterprise data systems
Collaboration:
- Partner with data architects, data scientists, BI teams, and business stakeholders to deliver data solutions
- Participate in Agile ceremonies and contribute to iterative delivery of data products
- Translate business requirements into scalable technical data solutions
Requirements
Required Qualifications:
- 3+ years of experience in data engineering, data platforms, or related roles
- Hands-on experience with Databricks, Apache Spark (PySpark), and Delta Lake
- Strong SQL and data modeling skills (relational and dimensional)
- Experience building and supporting data pipelines in a cloud environment (AWS or Azure)
- Experience with ELT/ETL tools (e.g., Fivetran, custom ingestion frameworks)
- Familiarity with data orchestration tools (Airflow, Control-M)
- Experience working in Agile development environments
- Experience in financial services or regulated environments (e.g., banking, risk, regulatory reporting)
- Knowledge of data governance frameworks and tools (e.g., Collibra)
- Experience with real-time or streaming data pipelines
- Exposure to machine learning pipelines and feature engineering in Databricks
- Cloud certifications (AWS, Azure, or Databricks)
Technical Skills:
- Databricks (Lakehouse architecture, notebooks, jobs, Unity Catalog)
- Spark / PySpark
- SQL (advanced querying and optimization)