Data Engineer

Innodata Inc
Ridgefield Park, United States of America
3 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Compensation
$100K - $120K

Job location

Remote
Ridgefield Park, United States of America

Tech stack

Query Performance
API
Artificial Intelligence
Amazon Web Services (AWS)
Google BigQuery
Cloud Storage
Data Centers
Information Engineering
Data Governance
ETL
Data Visualization
Data Warehousing
Database Queries
Data Flow Control
Python
Machine Learning
Systems Development Life Cycle
Power BI
SQL Databases
Tableau
Unstructured Data
Scripting (Bash/Python/Go/Ruby)
Build Management
Data Lake
Data Analytics
Machine Learning Operations
Looker Analytics
Data Pipelines

Job description

We are seeking a Data Engineer to design and build enterprise data warehouses, data lakes, and pipelines that power data-driven decision-making for data center supply chain and real estate operations. This role is responsible for creating scalable, secure, and optimized ETL infrastructure on GCP/AWS, while enabling advanced AI/ML use cases such as RAG, copilots, and agentic AI for predictive analytics and workflow automation.

  • Design and implement data-driven solutions on GCP including BigQuery, Cloud Storage, Dataflow, Pub/Sub, and Looker/BI.
  • Build ETL scripts using SQL and Python to extract, clean, and transform structured and unstructured data from ERP, procurement, logistics, and facility management systems.
  • Develop and optimize data pipelines for ingestion, transformation, and loading into enterprise data lakes and warehouses.
  • Build and extend end-to-end data and BI solutions, spanning extraction, storage, transformation, and visualization layers.
  • Partner with supply chain, real estate, and AI/ML teams to provide pipelines for AI solutions (e.g., RAG ingestion, Copilot integration, multi-agent workflows).
  • Ensure data governance, lineage, and compliance across supply chain datasets.
  • Continuously optimize query performance, ETL processes, and pipeline reliability.

Requirements

Do you have experience in system development?

  • Advanced proficiency in SQL (complex queries, optimization) and Python (data engineering, scripting, APIs).
  • Experience building ETL/ELT pipelines operating on structured and unstructured data sources.
  • Knowledge of enterprise data warehouse and data lake architectures.
  • Exposure to data pipelines for AI/ML (vector DB ingestion, embeddings, RAG pipelines, copilots, agents).
  • Familiarity with supply chain or data center operations data is a strong plus.
  • Bonus: experience with ML Engineering, data visualization tools (Looker, Tableau, Power BI) and MLOps practices.
  • Strong hands-on expertise with GCP services: BigQuery, Dataflow, Pub/Sub, Cloud Storage, Looker/BI (or similar, preferred).

Benefits & conditions

The expected salary range for this position is $100,000 - $120,000 USD per year, based on experience, skills, and qualifications.

About the company

Innodata (Nasdaq: INOD) is a global data engineering company. We believe that data and Artificial Intelligence (AI) are inextricably linked. Our mission is to enable the responsible advancement of artificial intelligence by providing the data, evaluation frameworks, and human expertise required to build AI systems that can be trusted at scale. We provide a range of transferable solutions, platforms, and services for Generative AI / AI builders and adopters. In every relationship, we honor our 36+ year legacy delivering the highest quality data and outstanding outcomes for our customers.
