Databricks Data Engineer - Remote

PALMETTO TECHNOLOGIES, LLC
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Intermediate
Compensation
$141K

Job location

Remote

Tech stack

Unity Catalog
API
Artificial Intelligence
Amazon Web Services (AWS)
Audit Trail
Azure
Software as a Service
Cloud Storage
Databases
Continuous Integration
Data Dictionary
Information Engineering
Data Governance
Data Integration
Relational Databases
Software Debugging
GitHub
Python
Key Management
Machine Learning
Open Source Technology
Operational Databases
Performance Tuning
Cloud Services
Search Technologies
Software Engineering
SQL Databases
Data Streaming
Unstructured Data
Enterprise Data Management
Google Cloud Platform
Enterprise Software Applications
Retrieval-Augmented Generation
Delivery Pipeline
Large Language Models
Prompt Engineering
Spark
Generative AI
Data Layers
Data Lake
PySpark
Deployment Automation
Hugging Face
Real Time Data
Kafka
Machine Learning Operations
Virtual Agents
REST
Terraform
Data Pipelines
Jenkins
Databricks

Job description

  • Design, develop, and optimize scalable data pipelines using Databricks, Apache Spark, PySpark, Python, and SQL.
  • Implement enterprise lakehouse architectures using Delta Lake, Delta Live Tables, Unity Catalog, and Databricks SQL.
  • Build batch and real-time data-processing solutions using technologies such as Structured Streaming, Kafka, Event Hubs, or Kinesis.
  • Develop reusable ingestion and transformation frameworks for structured, semi-structured, and unstructured data.
  • Design and implement medallion architectures using Bronze, Silver, and Gold data layers.
  • Develop agentic AI applications capable of planning tasks, retrieving information, calling tools, executing workflows, and validating results.
  • Build AI agents using frameworks such as Databricks Mosaic AI Agent Framework, LangChain, LangGraph, LlamaIndex, AutoGen, or Semantic Kernel.
  • Develop retrieval-augmented generation solutions using vector embeddings, semantic search, metadata filtering, and enterprise knowledge repositories.
  • Implement vector-search solutions using Databricks Vector Search or comparable vector databases.
  • Integrate large language models with enterprise applications, APIs, databases, document repositories, and business workflows.
  • Implement model and agent evaluation processes covering accuracy, groundedness, relevance, hallucination, safety, latency, and cost.
  • Develop agent memory, prompt-management, context-management, tool-calling, guardrail, and human-in-the-loop capabilities.
  • Use MLflow for experiment tracking, model registration, evaluation, deployment, and lifecycle management.
  • Build and support machine-learning and generative-AI deployment pipelines using Databricks Model Serving.
  • Implement CI/CD processes for notebooks, pipelines, infrastructure, models, prompts, and AI agents.
  • Automate Databricks deployments using Databricks Asset Bundles, Terraform, Azure DevOps, GitHub Actions, or Jenkins.
  • Establish data quality controls, reconciliation procedures, lineage, observability, monitoring, and alerting.
  • Implement security and governance using Unity Catalog, role-based access controls, row-level security, column masking, secrets management, and audit logging.
  • Optimize Spark workloads, cluster configurations, SQL queries, data layouts, and storage costs.
  • Troubleshoot production data pipelines, AI-agent workflows, integrations, model endpoints, and performance issues.
  • Collaborate with data architects, data scientists, machine-learning engineers, application developers, cybersecurity teams, and business stakeholders.
  • Create technical designs, architecture diagrams, data mappings, data dictionaries, runbooks, and operational documentation.
  • Mentor team members and establish engineering standards for Databricks, data pipelines, and agentic AI development.

Requirements

Do you have experience in testing and evaluation?

  • 7 or more years of experience in data engineering, software engineering, or enterprise data-platform development.
  • 4 or more years of hands-on experience with the Databricks platform.
  • 5 or more years of experience developing solutions using Python, PySpark, Apache Spark, and SQL.
  • Experience designing and implementing Databricks Lakehouse architectures.
  • Experience with Delta Lake, Delta Live Tables, Databricks Workflows, Databricks SQL, and Unity Catalog.
  • Experience developing batch and streaming data pipelines.
  • Experience designing data models and medallion architectures.
  • Hands-on experience developing generative AI, retrieval-augmented generation, or agentic AI solutions.
  • Experience building AI agents that use tools, APIs, enterprise data, and multi-step reasoning workflows.
  • Experience with at least one agent-development framework, such as:
      • Databricks Mosaic AI Agent Framework
      • LangChain
      • LangGraph
      • LlamaIndex
      • AutoGen
      • Semantic Kernel
  • Experience integrating large language models from platforms such as Azure OpenAI, OpenAI, Anthropic, AWS Bedrock, Hugging Face, or open-source model providers.
  • Experience with vector embeddings, vector search, semantic retrieval, document chunking, reranking, and prompt engineering.
  • Experience with MLflow, model registration, model serving, and AI application evaluation.
  • Experience integrating data from relational databases, REST APIs, cloud storage, event streams, SaaS applications, and document repositories.
  • Experience with cloud services in at least one of the following:
      • Microsoft Azure
      • Amazon Web Services
      • Google Cloud Platform
  • Experience implementing automated deployments and CI/CD pipelines.
  • Experience with data governance, security, lineage, access controls, and personally identifiable information protection.
  • Strong knowledge of data-quality testing, monitoring, debugging, and performance optimization.
  • Excellent written and verbal communication skills (very important).

Benefits & conditions

$65 - $68 an hour - Contract
