Data Engineer

India
6 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior

Job location

Tech stack

API
Artificial Intelligence
Amazon Web Services (AWS)
Data analysis
Continuous Integration
Data Architecture
Data Validation
Information Engineering
Data Integration
ETL
Python
PostgreSQL
Performance Tuning
Standard SQL
SQL Databases
Systems Integration
Software Version Management
Data Processing
Enterprise Software Applications
Data Ingestion
Azure
Data Layers
Data Pipelines

Job description

As a Data Analyst & Engineer for O2C Phase-1, you will build robust batch pipelines into a managed PostgreSQL data layer to ingest from CUBE/RegBook, MetricStream and the client's Entity master. You will implement high-quality, auditable data flows with strong contracts, lineage and idempotency.

You will collaborate with the Data Architect, Integrations Engineer and Reporting team to deliver reliable datasets and views that power persona-based dashboards.

Key Responsibilities

  1. Pipeline Engineering
  • Build and operate batch ingestion jobs (files/APIs) with retries, alerting and replay.
  • Implement source-to-target mappings, data quality checks, and schema evolution safely.
  2. Data Layer Build
  • Create and optimize tables, indexes and views for analytics and application use.
  • Contribute to PDM standards, partitioning, retention and performance baselines.
  3. Lineage & Controls
  • Capture lineage and provenance; ensure auditability of changes and versioning.
  • Handle PII/sensitive fields per policy; follow least-privilege patterns.
  4. Collaboration
  • Work with the integrations team to stabilize upstream feeds; support the reporting team on semantic models.
  • Support QA with data fixtures and automated validation for UAT.
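The responsibilities above centre on idempotent, retry-safe batch loads into PostgreSQL. A minimal sketch of both patterns, assuming the standard `INSERT ... ON CONFLICT` upsert idiom (the table and column names here are hypothetical, not from the posting):

```python
import time

def build_upsert(table, key_cols, value_cols):
    """Build an idempotent PostgreSQL upsert: replaying the same batch
    updates existing rows instead of duplicating them."""
    cols = key_cols + value_cols
    placeholders = ", ".join(f"%({c})s" for c in cols)
    updates = ", ".join(f"{c} = EXCLUDED.{c}" for c in value_cols)
    return (
        f"INSERT INTO {table} ({', '.join(cols)}) "
        f"VALUES ({placeholders}) "
        f"ON CONFLICT ({', '.join(key_cols)}) "
        f"DO UPDATE SET {updates}"
    )

def with_retries(step, attempts=3, backoff_s=1.0):
    """Run one batch step with exponential backoff; re-raise on the final
    failure so the scheduler can alert and queue a replay."""
    for attempt in range(1, attempts + 1):
        try:
            return step()
        except Exception:
            if attempt == attempts:
                raise  # surface for alerting/replay
            time.sleep(backoff_s * 2 ** (attempt - 1))
```

Because the upsert keys on the business identifier, rerunning a failed or replayed batch converges to the same final state, which is what makes retries safe in the first place.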

Requirements

  • 5-9 years in data engineering with strong SQL and ETL/ELT experience.
  • Proficiency in Python and SQL for data manipulation and data analysis.
  • Hands-on experience with AWS services including RDS for PostgreSQL, Step Functions, Lambda, Glue and S3.
  • Strong understanding of data modelling, schema design, and performance tuning.
  • Experience integrating with enterprise systems via batch/APIs; strong understanding of DQ and idempotency.
  • Familiarity with Azure data services, CI/CD for data pipelines, and Amazon SageMaker is a plus.

Build batch ingestion pipelines into managed PostgreSQL (Flexible Server) for CUBE/RegBook, MetricStream and Entity Master. Implement source-to-target mappings, data quality checks, idempotent loads, lineage capture and schedules per O2C Phase-1 scope (no AI). Optimize schemas, indexes and views used by reporting.
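The data quality checks called for above can be illustrated as a minimal pre-load gate that rejects records with missing required fields or duplicate business keys before they reach the data layer (the field names are hypothetical, not from the posting):

```python
def dq_gate(rows, required, unique_key):
    """Simple pre-load data-quality gate: split a batch into clean rows
    and rejects (null required fields or duplicate business keys)."""
    seen, clean, rejects = set(), [], []
    for row in rows:
        key = row.get(unique_key)
        if key is None or key in seen or any(row.get(c) is None for c in required):
            rejects.append(row)  # quarantine for review/replay
        else:
            seen.add(key)
            clean.append(row)
    return clean, rejects
```

Routing rejects to a quarantine table rather than failing the whole batch keeps loads auditable while letting clean records land on schedule.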

Apply for this position