Databricks

Covetus, LLC

Fort Worth, United States of America

2 days ago

Role details

Contract type

Permanent contract

Employment type

Full-time (> 32 hours)

Working hours

Regular working hours

Languages

English

Job location

Fort Worth, United States of America

Tech stack

Artificial Intelligence

Amazon Web Services (AWS)

Big Data

Data Infrastructure

ETL

Data Systems

Distributed Computing Environment

Distributed Systems

Hadoop

Python

Machine Learning

Performance Tuning

Cloudera

SAS (Software)

Data Storage Management

Large Language Models

Spark

Data Lake

PySpark

Data Management

Machine Learning Operations

Data Pipelines

Databricks

Data Generation

Job description

As a Databricks Data Engineer, you will be responsible for designing, developing, and maintaining data solutions for data generation, collection, and processing in a Big Data environment, predominantly using PySpark/Python. Your typical day will involve creating data pipelines, ensuring data quality, and implementing ETL processes to migrate and deploy data across systems using PySpark.

Roles & Responsibilities:

Collaborate closely with data scientists, data engineers, and business stakeholders to gather requirements and understand the business objectives driving data pipeline development.
Design, develop, and maintain robust, scalable, high-performance data pipelines using Databricks.
Leverage Databricks features such as Lakehouse and Delta Lake for efficient data storage and Spark for distributed processing.
Develop ETL/ELT pipelines using Databricks.
Monitor pipeline health and troubleshoot data issues.
Migrate on-premises PySpark and SAS data pipelines and ML models to Databricks.
Define and implement best practices in Databricks.
Evaluate new Databricks features and tools, helping the organization stay at the forefront of innovation in AI and Big Data.
Collaborate with cross-functional teams to identify and resolve data-related issues.

Requirements

Proven expertise in implementing Lakehouse and Delta Lake using Databricks.
Strong PySpark and Python experience.
Databricks Certified Data Engineer Professional certification.
Familiarity with MLOps/LLMOps and distributed systems.
Experience with Big Data platforms such as Cloudera Hadoop and cloud platforms such as AWS and GCP.
Solid understanding of system design patterns, scalability, observability, and performance tuning.
Strong analytical and problem-solving skills.
Passion for exploring and building with emerging technologies.

Good to Have Skills:

AWS EKS experience; Docker and containers.
