Data Engineer, Unified Platform
Job description
- Work closely with Data Strategists to determine appropriate data sources and implement processes to onboard and manage new data sources for trading, research, and back-office purposes.
- Contribute to data governance processes that enable discovery, cost-sharing, usage tracking, access controls, and quality control of datasets to address the needs of DRW trading teams and strategies.
- Continually monitor data ingestion pipelines to ensure the stability, reliability, and quality of the data, and contribute to the monitoring and quality-control software and processes.
- Own the technical aspects of vendor ingestion pipelines: coordinate with vendor relationship managers on upcoming changes, perform routine data operations without disrupting internal users, and contribute to the team's on-call rotation to respond to unanticipated changes.
- Rapidly respond to user requests, identifying platform gaps and self-service opportunities that make the user experience more efficient.
Requirements
- Have experience designing and building data pipelines.
- Have experience working within modern batch or streaming data ecosystems.
- Are an expert in SQL and have experience in Java or Python.
- Can apply data modeling techniques.
- Are able to own the delivery of data products, working with analysts and stakeholders to understand requirements and implement solutions.
- Are able to contribute to project management and project reporting.
- Have 3+ years of experience working with modern data technologies and/or building data-first products.
- Excellent written and verbal communication skills.
- Proven ability to work in a collaborative, agile, and fast-paced environment, prioritizing multiple tasks and projects and efficiently handling the demands of a trading environment.
- Proven ability to deliver rapid results within processes that span multiple stakeholders.
- Strong technical problem-solving skills.
- Extensive familiarity with SQL and Java or Python, with a proven ability to develop and deliver maintainable data transformations for production data pipelines.
- Experience leveraging data modeling techniques and the ability to articulate the trade-offs of different approaches.
- Experience with one or more data processing technologies (e.g. Flink, Spark, Polars, Dask).
- Experience with multiple data storage technologies (e.g. S3, RDBMS, NoSQL, Delta/Iceberg, Cassandra, ClickHouse, Kafka) and knowledge of their associated trade-offs.
- Experience with multiple data formats and serialization systems (e.g. Arrow, Parquet, Protobuf/gRPC, Avro, Thrift, JSON).
- Experience managing data pipeline orchestration systems (e.g. Kubernetes, Argo Workflows, Airflow, Prefect, Dagster).
- Proven experience managing the operational aspects of large data pipelines, such as backfilling datasets, rerunning batch jobs, and handling dead-letter queues.
- Prior experience triaging data quality control processes and correcting data gaps and inaccuracies.