Data Engineer

Optomi LLC
Hoboken, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English

Job location

Hoboken, United States of America

Tech stack

Artificial Intelligence
Amazon Web Services (AWS)
Databases
Continuous Integration
Data Validation
Python
Performance Tuning
Standard SQL
SQL Databases
Data Streaming
Systems Integration
Flask
Large Language Models
Concurrency
Backend
FastAPI
Event Driven Architecture
PySpark
Data Management
API Design
Data Pipelines
Databricks

Job description

  • Practical experience integrating AI and Large Language Models (LLMs) into data platforms and workflows
  • Leveraging AI technologies to enhance data quality and observability
  • Automating repetitive processes

Responsibilities of the right candidate:

  • Build and support enterprise-grade production systems
  • Optimize batch/streaming data pipelines
  • Integrate AI and LLMs into data platforms
  • Enhance data quality and automate processes

Requirements

Overview: We are seeking a skilled professional to build and support enterprise-grade production systems. The ideal candidate will have strong hands-on skills in PySpark, Python, and SQL, with experience building, operating, and optimizing data pipelines. Candidates with application engineering experience should be proficient in building APIs with FastAPI or Flask and have strong data modeling skills. Experience integrating AI and LLMs into workflows to enhance data quality and automate processes is highly desirable.

Job Must Haves:

  • Strong hands-on skills in PySpark, Python, and SQL
  • Databricks
  • AWS
  • Experience building, operating, and optimizing batch/streaming data pipelines
  • Experience with data quality checks and performance tuning in production
  • Experience building APIs and backend services using FastAPI or Flask
  • Strong data modeling skills (e.g., Pydantic)
  • Experience with event-driven architectures, concurrency/async processing, database integration, testing, CI/CD, containers, and production observability
