Data Engineer

CARITATECH LLC
Phoenix, United States of America
yesterday

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Phoenix, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Big Data
Cloud Computing
Data Architecture
Information Engineering
ETL
Distributed Systems
Python
Standard SQL
Search Technologies
Data Streaming
Google Cloud Platform
Retrieval-Augmented Generation
Large Language Models
Multi-Agent Systems
Prompt Engineering
Spark
Generative AI
Build Management
Machine Learning Operations
Data Pipelines
Databricks

Job description

  • Design and build scalable data pipelines to support AI/ML and GenAI workloads
  • Develop and deploy RAG (Retrieval-Augmented Generation) architectures using vector databases
  • Build and manage LLM-powered applications, including custom GPTs and enterprise AI assistants
  • Implement agentic workflows using frameworks like LangChain, AutoGen, or similar
  • Integrate structured and unstructured data sources for AI consumption
  • Optimize data models and pipelines for performance, scalability, and reliability
  • Collaborate with data scientists, ML engineers, and business stakeholders to deliver AI-driven solutions
  • Ensure data quality, governance, and security across AI pipelines
  • Work with cloud platforms (AWS/Azure/Google Cloud Platform) for scalable GenAI deployments

Requirements

  • 7+ years of Data Engineering experience (ETL, data pipelines, big data processing)
  • Proven hands-on experience in Generative AI / LLM-based solutions
  • Strong experience building RAG pipelines with vector databases (Pinecone, FAISS, Weaviate, etc.)
  • Experience with agent frameworks (LangChain, LangGraph, AutoGen, CrewAI, etc.)
  • Solid programming skills in Python
  • Experience working with LLMs (OpenAI, Azure OpenAI, Claude, etc.)
  • Strong SQL and data modeling skills
  • Experience with Spark, Databricks, or similar big data technologies
  • Hands-on experience with cloud platforms (AWS/Azure/Google Cloud Platform)
  • Experience building custom GPTs or AI copilots
  • Familiarity with prompt engineering, embeddings, and vector search
  • Knowledge of MLOps / LLMOps practices
  • Exposure to real-time or streaming data pipelines
  • Strong understanding of data architecture and distributed systems

Apply for this position