Data Engineer
Triwave Solutions
2 days ago
Role details
- Contract type: Permanent contract
- Employment type: Full-time (> 32 hours)
- Working hours: Regular working hours
- Languages: English
- Experience level: Junior
- Job location:
Tech stack
API
Airflow
Amazon Web Services (AWS)
Azure
Big Data
Google BigQuery
Health Informatics
Data Architecture
Information Engineering
Data Governance
ETL
Data Security
GitHub
Hadoop
JSON
Python
PostgreSQL
Metadata
Microsoft SQL Server
MySQL
Oracle Applications
Performance Tuning
Scrum
SAS (Software)
SQL Stored Procedures
SQL Databases
Systems Integration
XML
Snowflake
Spark
CloudFormation
Pandas
Data Lake
Collibra
Kafka
Spark Streaming
Terraform
Redshift
Job description
The scope of the proposed services will include the following:
- Assess feasibility and technical requirements for LINKS DataLake integration.
- Collaborate with OPH Immunization Program, OPH Bureau of Health Informatics and
STChealth on data specifications and recurring ingestion pipelines.
- Build and optimize ETL workflows for LINKS and complementary datasets (Vital
Records, labs, registries).
- Design scalable data workflows to improve data quality, integrity, and identity resolution.
- Implement data governance, observability, and lineage tracking across all pipelines.
- Mentor engineers, support testing, and enforce best practices in orchestration and
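The ingestion and identity-resolution duties above can be pictured as one minimal extract–transform–load step. This is an illustrative sketch only: the record fields, the deduplication key, and the SQLite staging target are assumptions, not the actual LINKS schema or pipeline.

```python
import json
import sqlite3

# Hypothetical raw extract; in practice this would arrive from an API or file drop.
RAW = json.dumps([
    {"patient_id": "P001", "dob": "2015-03-02", "vaccine": "MMR", "dose": 1},
    {"patient_id": "P001", "dob": "2015-03-02", "vaccine": "MMR", "dose": 1},  # duplicate
    {"patient_id": "P002", "dob": "2018-07-19", "vaccine": "DTaP", "dose": 2},
])

def extract(payload: str) -> list[dict]:
    """Parse the raw JSON payload into records."""
    return json.loads(payload)

def transform(records: list[dict]) -> list[dict]:
    """Deduplicate on an assumed identity key (patient_id, dob, vaccine, dose)."""
    seen, clean = set(), []
    for r in records:
        key = (r["patient_id"], r["dob"], r["vaccine"], r["dose"])
        if key not in seen:
            seen.add(key)
            clean.append(r)
    return clean

def load(records: list[dict], conn: sqlite3.Connection) -> int:
    """Write cleaned records to a staging table; returns the row count."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS immunizations "
        "(patient_id TEXT, dob TEXT, vaccine TEXT, dose INTEGER)"
    )
    conn.executemany(
        "INSERT INTO immunizations VALUES (:patient_id, :dob, :vaccine, :dose)",
        records,
    )
    return conn.execute("SELECT COUNT(*) FROM immunizations").fetchone()[0]

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    rows = load(transform(extract(RAW)), conn)
    print(rows)  # the duplicate record is dropped, so 2 rows land
```

In a production setting each of these functions would typically become a task in an orchestrator such as Prefect or Airflow, so failures, retries, and lineage can be tracked per step.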
Requirements
Expertise and/or relevant experience in the following areas is mandatory:
- 3 years of experience in data engineering and/or data architecture
- 2 years of experience with Python for ETL and automation (pandas, requests, API
integration).
- 2 years of hands-on experience with SQL queries, stored procedures, and performance tuning (preferably Oracle, SQL Server, MySQL)
- 1 year of experience with ETL orchestration tools (Prefect, Airflow, or equivalent).
- 1 year of experience with cloud platforms (Azure, AWS, or GCP), including data onboarding/migration.
- 1 year of exposure to data lake / medallion architecture (bronze, silver, gold)
- 2 years of experience providing written documentation and verbal communication for cross-functional collaboration.
Expertise and/or relevant experience in the following areas is desirable but not mandatory:
- 5+ years of experience in data engineering roles
- Experience integrating or developing REST/JSON or XML APIs
- Familiarity with CI/CD pipelines (GitHub Actions, Azure DevOps, etc.).
- Exposure to Infrastructure as Code (Terraform, CloudFormation).
- Experience with data governance and metadata tools (Atlan, OpenMetadata, Collibra).
- Public health/healthcare dataset or similar experience, including PHI/PII handling.
- Familiarity with SAS and R workflows to support epidemiologists and analysts.
- Experience with additional SQL platforms (Postgres, Snowflake, Redshift, BigQuery).
- Familiarity with data quality frameworks (Great Expectations, Deequ).
- Experience with real-time/streaming tools (Kafka, Spark Streaming).
- Familiarity with big data frameworks for large-scale transformations (Spark, Hadoop).
- Knowledge of data security and compliance frameworks (HIPAA, SOC 2, etc.).
- Agile/SCRUM team experience.
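The medallion-architecture and data-quality items above fit together as staged layers: raw records land in bronze, records that pass quality expectations are promoted to silver, and aggregates are served from gold. A minimal standard-library sketch, with made-up field names and hand-rolled checks standing in for a real framework such as Great Expectations:

```python
from collections import Counter

# Bronze: records exactly as ingested, including a bad row (hypothetical fields).
bronze = [
    {"patient_id": "P001", "vaccine": "MMR"},
    {"patient_id": "P002", "vaccine": "DTaP"},
    {"patient_id": "", "vaccine": "MMR"},  # fails the not-null expectation
]

def to_silver(rows: list[dict]) -> list[dict]:
    """Promote only rows that pass basic not-null quality expectations."""
    return [r for r in rows if r["patient_id"] and r["vaccine"]]

def to_gold(rows: list[dict]) -> dict[str, int]:
    """Aggregate validated rows into a serving-layer summary (doses per vaccine)."""
    return dict(Counter(r["vaccine"] for r in rows))

silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)  # {'MMR': 1, 'DTaP': 1}
```

The design point is that bad data is quarantined at the bronze-to-silver boundary rather than silently flowing into reports; a governance tool records which rows were rejected and why.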