Data Linkage Engineer
Job description
Verify and validate record linkage during and after migration, ensuring the new system matches or improves existing behavior.
Analyze large datasets (millions to billions of records) to define metrics, measure linkage quality, and interpret results.
Investigate discrepancies between old and new linkage outputs and determine root causes.
Collaborate with internal teams to design and implement robust linkage solutions.
Contribute to the long-term development of advanced data-linkage strategies post-migration.
Troubleshoot blockers and unexpected issues, working reactively and creatively.
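The verification work above (comparing old and new linkage outputs) is often expressed as pairwise precision/recall between two sets of linked record pairs. A minimal sketch, with hypothetical record IDs and no claim about the actual systems involved:

```python
def norm(pairs):
    """Normalise each linked pair so (a, b) and (b, a) compare equal."""
    return {tuple(sorted(p)) for p in pairs}

def linkage_metrics(reference_pairs, candidate_pairs):
    """Pairwise precision, recall, and F1 of a candidate linkage
    against a reference linkage (e.g. new system vs. old system)."""
    ref, cand = norm(reference_pairs), norm(candidate_pairs)
    tp = len(ref & cand)  # pairs linked by both runs
    precision = tp / len(cand) if cand else 0.0
    recall = tp / len(ref) if ref else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical example: the old run linked three pairs, the new run
# reproduces two of them and introduces one new pair.
old = [("r1", "r2"), ("r3", "r4"), ("r5", "r6")]
new = [("r2", "r1"), ("r3", "r4"), ("r7", "r8")]
print(linkage_metrics(old, new))
```

Discrepant pairs (in one set but not the other) are exactly the cases the root-cause investigation would drill into.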
Requirements
Strong data linkage/record linkage experience - this is the number-one requirement.
Experience working with very large datasets (hundreds of millions to billions of rows).
Ability to define, measure, and interpret statistical or analytical metrics related to linkage quality.
Background in software engineering fundamentals and problem-solving.
Cloud experience (Azure preferred) - not essential, but beneficial.
Experience with distributed or big-data technologies (e.g., PySpark, Spark, Azure Cosmos DB).
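At the scale named above, all-pairs comparison is infeasible, which is why distributed linkage pipelines rely on blocking: only records sharing a blocking key are compared. A minimal single-machine sketch of the idea, using made-up records and a hypothetical `block_key` function (a Spark implementation would do the same grouping with a join on the key):

```python
from collections import defaultdict

def candidate_pairs(records, block_key):
    """Group records by a blocking key and emit candidate pairs
    within each block, avoiding the O(n^2) all-pairs comparison."""
    blocks = defaultdict(list)
    for rec in records:
        blocks[block_key(rec)].append(rec["id"])
    pairs = set()
    for ids in blocks.values():
        for i in range(len(ids)):
            for j in range(i + 1, len(ids)):
                pairs.add(tuple(sorted((ids[i], ids[j]))))
    return pairs

# Hypothetical records: blocking on postcode means only records
# 1 and 2 become a candidate pair; record 3 is never compared.
people = [
    {"id": 1, "surname": "Smith", "postcode": "AB1 2CD"},
    {"id": 2, "surname": "Smyth", "postcode": "AB1 2CD"},
    {"id": 3, "surname": "Jones", "postcode": "ZZ9 9ZZ"},
]
print(candidate_pairs(people, lambda r: r["postcode"]))  # {(1, 2)}
```

The choice of blocking key trades recall (pairs missed because they fall in different blocks) against compute cost, which is one of the quality metrics the role would measure.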
Nice-to-Have
Java experience - not essential; can be taught if needed.
A degree in mathematics, physics, or a related analytical field.
Familiarity with identity verification, fraud detection, insurance data, or similar domains.