Databricks Engineer

CogniSoft Technologies
4 days ago

Role details

Contract type
Temporary to permanent
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Remote

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Cloud Computing
Continuous Integration
Information Engineering
ETL
Data Masking
DevOps
Python
Metadata Management
Performance Tuning
Role-Based Access Control
Standard SQL
SQL Databases
Data Streaming
Data Processing
Data Storage Technologies
Data Ingestion
Open-source DevOps Tools
Spark
Software Troubleshooting
Data Strategy
Git
CloudFormation
Data Lake
PySpark
Data Lineage
Kafka
Data Lakehouse
Terraform
Stream Processing
Data Pipelines
Databricks

Job description

We are looking for a hands-on Databricks Engineer with strong AWS experience to design, build, and optimize scalable data pipelines and lakehouse solutions. The role focuses on implementing robust batch and streaming data solutions using Databricks, Delta Lake, and AWS cloud-native services, ensuring high performance, scalability, and security.

Key Responsibilities

  • Build and maintain end-to-end data pipelines using Databricks, Delta Lake, and AWS services

  • Develop batch, real-time, and streaming data processing workflows

  • Implement data ingestion, transformation, curation, and storage pipelines

  • Build and optimize large-scale PySpark and SQL-based jobs in Databricks

  • Enable real-time data processing using Kafka, AWS Kinesis, or similar streaming tools

Data Lakehouse Implementation

  • Work on Databricks-based lakehouse architecture using Delta Lake

  • Implement scalable and optimized data storage and processing frameworks

  • Ensure data quality, consistency, and reliability across pipelines

  • Support metadata management, data lineage, and governance implementation

Cloud & Platform Engineering (AWS)

  • Work with AWS services such as S3, Glue, Lambda, Kinesis, and Redshift

  • Ensure pipelines are scalable, secure, and cost-optimized in AWS environments

  • Implement security controls including RBAC, encryption, and data masking

Optimization & Best Practices

  • Tune Spark jobs for performance and cost efficiency

  • Monitor and troubleshoot data pipeline issues in production

  • Follow CI/CD and DevOps practices for deploying data engineering solutions

  • Ensure adherence to data engineering standards and best practices

Collaboration

  • Work closely with BI teams and business stakeholders

  • Support analytics and AI/ML data requirements through curated datasets

  • Collaborate with architects to ensure alignment with AWS-based data strategy

Technical Leadership & Architecture

  • Lead the design and implementation of scalable, end-to-end data pipelines and lakehouse solutions
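To give a concrete (if simplified) sense of the ingest → transform → curate pipeline work described above, here is a minimal plain-Python sketch. In practice these steps would run as PySpark jobs writing Delta tables on Databricks; all record and field names here are hypothetical illustrations, not the employer's actual schema:

```python
from dataclasses import dataclass

@dataclass
class OrderEvent:
    order_id: str
    amount: float
    currency: str

def ingest(raw_rows):
    """Parse raw dicts into typed records, dropping malformed rows."""
    events = []
    for row in raw_rows:
        try:
            events.append(OrderEvent(str(row["order_id"]),
                                     float(row["amount"]),
                                     str(row.get("currency", "USD"))))
        except (KeyError, ValueError, TypeError):
            continue  # a real pipeline would route these to a quarantine table
    return events

def curate(events):
    """Aggregate order totals per currency -- the 'curated' layer."""
    totals = {}
    for e in events:
        totals[e.currency] = totals.get(e.currency, 0.0) + e.amount
    return totals

raw = [{"order_id": 1, "amount": "19.99"},
       {"order_id": 2, "amount": 5.0, "currency": "EUR"},
       {"amount": "bad"}]
print(curate(ingest(raw)))  # {'USD': 19.99, 'EUR': 5.0}
```

The same shape (parse, validate, aggregate) carries over directly to PySpark DataFrame transformations; the per-row error handling would typically become a schema-on-read step with a quarantine path.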

Requirements

  • Strong hands-on experience with Databricks

  • Proficiency in Python, PySpark, and SQL

  • Strong experience in AWS cloud services (S3, Glue, Lambda, Kinesis, Redshift)

  • Experience building ETL/ELT data pipelines

  • Strong understanding of Delta Lake and lakehouse concepts

  • Experience with streaming and batch data processing

  • Knowledge of CI/CD tools and Git

  • Strong troubleshooting and performance tuning skills

  • Databricks Certified Data Engineer Professional certification is mandatory

Preferred / Nice to Have

  • IaC (Terraform/CloudFormation)

  • Data quality & observability frameworks

  • Deeper Databricks-specific features (DLT, Unity Catalog, Workflows)

  • Security & compliance depth

  • DevOps tooling specifics

  • Leadership/co

  • Communication expectations
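On the data-masking and security points above, a minimal sketch of deterministic hash-based masking, one common approach for hiding PII while keeping columns joinable. The column set, salt handling, and truncation length are simplifying assumptions for illustration, not the employer's actual scheme (a real deployment would use a managed secret for the salt and apply this inside a PySpark UDF or Databricks column mask):

```python
import hashlib

PII_COLUMNS = {"email", "phone"}  # hypothetical column set

def mask_value(value: str, salt: str = "demo-salt") -> str:
    """Deterministically mask a value: equal inputs stay joinable, raw value is hidden."""
    return hashlib.sha256((salt + value).encode("utf-8")).hexdigest()[:16]

def mask_row(row: dict) -> dict:
    """Mask only the PII columns of a record; pass everything else through."""
    return {k: (mask_value(v) if k in PII_COLUMNS else v) for k, v in row.items()}

row = {"user_id": 42, "email": "jane@example.com"}
masked = mask_row(row)
print(masked["user_id"])        # 42 -- non-PII column unchanged
print(masked["email"] != row["email"])  # True -- PII column masked
```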

Apply for this position