Data Engineer
Triwave Solutions
2 days ago
Role details
- Contract type: Permanent contract
- Employment type: Full-time (> 32 hours)
- Working hours: Regular working hours
- Languages: English
- Experience level: Junior
- Job location:
Tech stack
API
Airflow
Amazon Web Services (AWS)
Azure
Big Data
Google BigQuery
Health Informatics
Data Architecture
Information Engineering
Data Governance
ETL
Data Security
GitHub
Hadoop
JSON
Python
PostgreSQL
Metadata
Microsoft SQL Server
MySQL
Oracle Applications
Performance Tuning
Scrum
SAS (Software)
SQL Stored Procedures
SQL Databases
Systems Integration
XML
Snowflake
Spark
CloudFormation
Pandas
Data Lake
Collibra
Kafka
Spark Streaming
Terraform
Redshift
Job description
The scope of the proposed services will include the following:
- Assess feasibility and technical requirements for LINKS DataLake integration.
- Collaborate with OPH Immunization Program, OPH Bureau of Health Informatics and
STChealth on data specifications and recurring ingestion pipelines.
- Build and optimize ETL workflows for LINKS and complementary datasets (Vital
Records, labs, registries).
- Design scalable data workflows to improve data quality, integrity, and identity resolution.
- Implement data governance, observability, and lineage tracking across all pipelines.
- Mentor engineers, support testing, and enforce best practices in orchestration and
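The ingestion and identity-resolution duties above can be pictured as one minimal extract–transform–load step. This is an illustrative sketch only: the record fields, the deduplication key, and the SQLite staging target are assumptions, not the actual LINKS schema or pipeline.

```python
import json
import sqlite3

# Hypothetical raw extract; in practice this would arrive from an API or file drop.
RAW = json.dumps([
    {"patient_id": "P001", "dob": "2015-03-02", "vaccine": "MMR", "dose": 1},
    {"patient_id": "P001", "dob": "2015-03-02", "vaccine": "MMR", "dose": 1},  # duplicate
    {"patient_id": "P002", "dob": "2018-07-19", "vaccine": "DTaP", "dose": 2},
])

def extract(payload: str) -> list[dict]:
    """Parse the raw JSON payload into records."""
    return json.loads(payload)

def transform(records: list[dict]) -> list[dict]:
    """Deduplicate on an assumed identity key (patient_id, dob, vaccine, dose)."""
    seen, clean = set(), []
    for r in records:
        key = (r["patient_id"], r["dob"], r["vaccine"], r["dose"])
        if key not in seen:
            seen.add(key)
            clean.append(r)
    return clean

def load(records: list[dict], conn: sqlite3.Connection) -> int:
    """Write cleaned records to a staging table; returns the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS immunizations "
        "(patient_id TEXT, dob TEXT, vaccine TEXT, dose INTEGER)"
    )
    conn.executemany(
        "INSERT INTO immunizations VALUES (:patient_id, :dob, :vaccine, :dose)",
        records,
    )
    return conn.execute("SELECT COUNT(*) FROM immunizations").fetchone()[0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    rows = load(transform(extract(RAW)), conn)
    print(rows)  # the duplicate record is dropped, so 2 rows land
```

In a production setting each of these functions would typically become a task in an orchestrator such as Prefect or Airflow, so failures, retries, and lineage can be tracked per step.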
Requirements
Expertise and/or relevant experience in the following areas is mandatory:
- 3 years of experience in data engineering and/or data architecture
- 2 years of experience with Python for ETL and automation (pandas, requests, API
integration).
- 2 years of hands-on experience with SQL queries, stored procedures, and performance tuning (preferably Oracle, SQL Server, MySQL)
- 1 year of experience with ETL orchestration tools (Prefect, Airflow, or equivalent).
- 1 year of experience with cloud platforms (Azure, AWS, or GCP), including data onboarding/migration.
- 1 year of exposure to data lake / medallion architecture (bronze, silver, gold)
- 2 years of experience providing written documentation and verbal communication for cross-functional collaboration.
Expertise and/or relevant experience in the following areas is desirable but not mandatory:
- 5+ years of experience in data engineering roles
- Experience integrating or developing REST/JSON or XML APIs
- Familiarity with CI/CD pipelines (GitHub Actions, Azure DevOps, etc.).
- Exposure to Infrastructure as Code (Terraform, CloudFormation).
- Experience with data governance and metadata tools (Atlan, OpenMetadata, Collibra).
- Public health/healthcare dataset or similar experience, including PHI/PII handling.
- Familiarity with SAS and R workflows to support epidemiologists and analysts.
- Experience with additional SQL platforms (Postgres, Snowflake, Redshift, BigQuery).
- Familiarity with data quality frameworks (Great Expectations, Deequ).
- Experience with real-time/streaming tools (Kafka, Spark Streaming).
- Familiarity with big data frameworks for large-scale transformations (Spark, Hadoop).
- Knowledge of data security and compliance frameworks (HIPAA, SOC 2, etc.).
- Agile/SCRUM team experience.
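The medallion-architecture and data-quality items above fit together as staged layers: raw records land in bronze, records that pass quality expectations are promoted to silver, and aggregates are served from gold. A minimal standard-library sketch, with made-up field names and hand-rolled checks standing in for a real framework such as Great Expectations:

```python
from collections import Counter

# Bronze: records exactly as ingested, including a bad row (hypothetical fields).
bronze = [
    {"patient_id": "P001", "vaccine": "MMR"},
    {"patient_id": "P002", "vaccine": "DTaP"},
    {"patient_id": "", "vaccine": "MMR"},  # fails the not-null expectation
]

def to_silver(rows: list[dict]) -> list[dict]:
    """Promote only rows that pass basic not-null quality expectations."""
    return [r for r in rows if r["patient_id"] and r["vaccine"]]

def to_gold(rows: list[dict]) -> dict[str, int]:
    """Aggregate validated rows into a serving-layer summary (doses per vaccine)."""
    return dict(Counter(r["vaccine"] for r in rows))

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'MMR': 1, 'DTaP': 1}
```

The design point is that bad data is quarantined at the bronze-to-silver boundary rather than silently flowing into reports; a governance tool records which rows were rejected and why.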