Principal Data Engineer
Job description
- Lead enterprise-level data architecture and platform strategy, establishing standards, best practices, and scalable frameworks across multiple projects and teams.
- Design and deliver robust, production-grade data pipelines supporting batch, streaming, and real-time use cases with built-in data quality, observability, and governance.
- Architect and enable data foundations for AI and LLM-driven applications, including unstructured data ingestion, embedding pipelines, and vector-based retrieval systems.
- Own end-to-end technical program execution, breaking down complex initiatives into structured deliverables while aligning cross-functional teams to business outcomes.
- Serve as the primary technical advisor in client-facing engagements, translating ambiguous requirements into clear architectural solutions and technical roadmaps.
- Define and implement cloud data strategies within AWS, including data lakes, warehouses, streaming platforms, and AI-ready infrastructure.
- Establish data contracts, governance frameworks, and performance standards to ensure reliability, scalability, and consistency across platforms.
- Drive DevOps and DataOps best practices, including infrastructure-as-code, automation, and cost optimization strategies.
- Mentor and develop engineering talent, providing technical guidance, conducting code and architecture reviews, and elevating overall team capability.
- Contribute to business development efforts by supporting solution design, technical proposals, and client presentations as a senior technical voice.
Requirements
Our client is seeking a highly strategic and hands-on Principal Data Engineer who thrives in complex, ambiguous environments and can translate business needs into scalable, production-ready data and AI solutions. This individual will operate as both a technical authority and program leader, driving enterprise data architecture, mentoring engineering teams, and delivering high-impact solutions across multiple workstreams. The ideal candidate brings a strong blend of deep technical expertise, systems thinking, and the ability to influence both technical and executive stakeholders.
- 7+ years of hands-on data engineering experience, including at least 3 years in a lead, architect, or principal-level role.
- Proven expertise designing and deploying scalable ETL/ELT pipelines, data platforms, and streaming architectures in production environments.
- Deep experience with AWS data services such as S3, Redshift, Glue, Lake Formation, Athena, EMR, Kinesis, Lambda, and Step Functions.
- Demonstrated experience supporting AI/ML or LLM data workflows, including data preprocessing, embedding generation, and vector database integration.
- Strong background in solution architecture and client engagement, with the ability to translate business needs into technical designs.
- Experience leading technical programs, including planning, estimation, and managing workstreams using tools such as Jira.
- Advanced understanding of data governance, lineage, and data quality best practices within enterprise environments.
- Proficiency with modern data tooling such as Databricks, dbt, and orchestration frameworks (e.g., Airflow or Prefect).
- Experience with infrastructure-as-code and DevOps practices using tools such as Terraform or similar frameworks.
- Bachelor's degree in Computer Science, Data Engineering, or a related field, or equivalent practical experience.