Data Engineer

CARITATECH LLC
Phoenix, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Phoenix, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Big Data
Cloud Computing
Data Architecture
Information Engineering
ETL
Distributed Systems
Python
Standard SQL
Search Technologies
Data Streaming
Google Cloud Platform
Retrieval-Augmented Generation
Large Language Models
Multi-Agent Systems
Prompt Engineering
Spark
Generative AI
Build Management
Machine Learning Operations
Data Pipelines
Databricks

Job description

  • Design and build scalable data pipelines to support AI/ML and GenAI workloads
  • Develop and deploy RAG (Retrieval-Augmented Generation) architectures using vector databases
  • Build and manage LLM-powered applications, including custom GPTs and enterprise AI assistants
  • Implement agentic workflows using frameworks like LangChain, AutoGen, or similar
  • Integrate structured and unstructured data sources for AI consumption
  • Optimize data models and pipelines for performance, scalability, and reliability
  • Collaborate with data scientists, ML engineers, and business stakeholders to deliver AI-driven solutions
  • Ensure data quality, governance, and security across AI pipelines
  • Work with cloud platforms (AWS/Azure/Google Cloud Platform) for scalable GenAI deployments

Requirements

  • 7+ years of Data Engineering experience (ETL, data pipelines, big data processing)
  • Proven hands-on experience in Generative AI / LLM-based solutions
  • Strong experience building RAG pipelines with vector databases (Pinecone, FAISS, Weaviate, etc.)
  • Experience with agent frameworks (LangChain, LangGraph, AutoGen, CrewAI, etc.)
  • Solid programming skills in Python
  • Experience working with LLMs (OpenAI, Azure OpenAI, Claude, etc.)
  • Strong SQL and data modeling skills
  • Experience with Spark, Databricks, or similar big data technologies
  • Hands-on experience with cloud platforms (AWS/Azure/Google Cloud Platform)
  • Experience building custom GPTs or AI copilots
  • Familiarity with prompt engineering, embeddings, and vector search
  • Knowledge of MLOps / LLMOps practices
  • Exposure to real-time or streaming data pipelines
  • Strong understanding of data architecture and distributed systems

Apply for this position