Data Engineer (Chinese Speaking)
Job description
We are looking for a skilled and experienced Data Engineer to join our Data team.
In this role, you will design, build, and maintain scalable data pipelines and robust data models that support analytics, reporting, operational workflows, back-office and risk systems, and product data needs. You will work closely with Data Analysts, Data Scientists, and Business stakeholders to provide clean, reliable, high-quality data that supports data-driven decisions.
You'll be responsible for turning raw data from multiple sources into well-structured, analysis-ready datasets and building the backbone of our data platform to meet both current and future business demands.
- Design, implement, and maintain scalable, robust data pipelines (batch and streaming) for the ingestion, transformation, and integration of data from diverse internal systems.
- Build and maintain data models, schemas, and data tables (warehouse/lakehouse) that support analytics, reporting, and operational workloads.
- Develop ETL/ELT workflows with transformation, aggregation, and enrichment logic to produce clean, high-quality, analysis-ready datasets.
- Collaborate with Data Analysts, Data Scientists, and Business stakeholders to gather requirements and translate them into data specifications and data structures.
- Optimize data storage and processing performance: manage partitioning, indexing, schema design, table layout, and resource allocation for efficient processing and query performance.
- Maintain and document data architecture, source-to-target mappings, lineage definitions, and schema versions; ensure clarity and maintainability of data assets.
- Ensure data quality, consistency, and reliability so that downstream analytics, reporting, and operations teams can trust the data.
Requirements
- 2+ years of experience in Data Engineering or a similar data-intensive engineering role.
- Strong proficiency in SQL and at least one programming language (e.g. Python).
- Hands-on experience with batch and streaming data processing using distributed frameworks such as Spark or Flink.
- Familiarity with modern data lakehouse or data warehouse technologies such as Delta Lake, Apache Hudi, ClickHouse, or Doris.
- Strong understanding of data modelling principles, schema design, partitioning strategies, and data architecture patterns.
- Proven skills in writing clean, maintainable, and well-documented data transformation code; ability to design pipelines that are robust, testable, and scalable.
- Ability to communicate effectively with both technical and non-technical stakeholders and translate business requirements into technical data solutions.
- Strong problem-solving skills, attention to detail, and the ability to troubleshoot complex data issues and performance bottlenecks.
- Mandarin proficiency is preferred.
Nice to have
- Experience with containerization or infrastructure tooling (e.g. Docker, Kubernetes), or involvement in CI/CD workflows.
- Experience working on large-scale data systems, high-volume data ingestion, distributed storage, and analytical workloads.
- Exposure to supporting machine learning pipelines or data science workflows.
- Familiarity with cloud concepts is a plus.