Data Engineer (AWS) - IND

Insight Global
St. Louis Park, United States of America
4 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

St. Louis Park, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Azure
Cloud Computing
Encodings
Continuous Integration
Information Engineering
ETL
Python
Search Technologies
Software Construction
Unstructured Data
AI Infrastructure
Data Logging
Enterprise Software Applications
Large Language Models
Generative AI
AWS Lambda
Data Pipelines
Serverless Computing

Job description

Insight Global is seeking an AI Data Engineer with deep expertise in AWS to join one of our medical device clients local to India. In this role, you will develop scalable ETL/ELT pipelines to ingest and process large volumes of unstructured data, using cloud-native services such as Azure Document Intelligence, AWS Textract, and other OCR tools to extract and prepare content at scale. You will architect and manage high-quality embedding pipelines, chunking strategies, and vector database integrations, including platforms like Azure AI Search, to support Retrieval-Augmented Generation (RAG) workflows and intelligent search capabilities. You will build retrieval and orchestration pipelines that connect data to LLMs, implement resilient CI/CD workflows, and ensure strong logging, monitoring, and error handling for production reliability. Working closely with platform and application teams, you will integrate LLM-powered features into enterprise applications and deploy services using functions, containers, and event-driven cloud technologies such as AWS Lambda and Azure Functions. A strong focus on security, compliance, scalability, and performance will be critical as you help advance the organization's AI engineering ecosystem.

Requirements

-5+ years in data engineering or AI infrastructure roles

-Expertise in AWS

-Hands-on experience with vector stores and embedding pipelines

-Strong Python development experience

-Experience with OCR/document intelligence tools

-Strong familiarity with LLMs, RAG architectures, embeddings, and retrieval techniques

-Experience with CI/CD and software engineering best practices

-Experience building and maintaining data pipelines
