Data Engineer - QuantumBlack, AI by McKinsey
Job description
As a Data Engineer at QuantumBlack, you will work in cross-functional Agile project teams alongside Data Scientists, Machine Learning Engineers, other Data Engineers, Project Managers, and industry experts. You will work hand-in-hand with our clients, from data owners, users, and fellow engineers to C-level executives. You are a highly collaborative individual who wants to solve problems that drive business value. You have a strong sense of ownership and enjoy hands-on technical work. Our values resonate with yours.
As a Data Engineer, you'll:
- Help to build and maintain the technical platform for advanced analytics engagements, spanning data science and data engineering work.
- Design and build data pipelines for machine learning that are robust, modular, scalable, deployable, reproducible, and versioned.
- Create and manage data environments and ensure information security standards are maintained at all times.
- Understand clients' data landscapes and assess data quality.
- Map data fields to hypotheses and curate, wrangle, and prepare data for use in advanced analytics models.
- Have the opportunity to contribute to R&D projects and internal asset development.
- Contribute to cross-functional problem-solving sessions with your team and our clients, from data owners and users to C-level executives, to address their needs and build impactful analytics solutions.
Requirements
- Degree in computer science, engineering, mathematics, or equivalent experience
- 2+ years of relevant professional experience
- Ability to write clean, maintainable, scalable, and robust code in an object-oriented language (e.g., Python, Scala, or Java) in a professional setting
- Proven experience building data pipelines in production for advanced analytics use cases
- Experience working across structured, semi-structured and unstructured data
- Exposure to software engineering concepts and best practices, including DevOps, DataOps, and MLOps, would be considered a plus
- Familiarity with distributed computing frameworks (e.g. Spark, Dask), cloud platforms (e.g. AWS, Azure, GCP), containerization, and analytics libraries (e.g. pandas, numpy, matplotlib)
- Commercial client-facing or senior stakeholder management experience would be beneficial