Senior Data Engineer
Role details
Job location
Tech stack
Job description
You are a Senior Data Engineer who thrives on complexity and scale. You don't just build pipelines; you define the data ontology for the entire Mail organization. You are a strategic thinker who can influence event instrumentation at the source while collaborating with Data Science and ML teams to deliver high-impact analytics., * Architect: Partner with Data Science, Product, and Engineering to define the global data ontology for Yahoo Mail and lead the technical roadmap for core datasets. * Build Scalable Systems: Design, build, and maintain high-reliability batch and streaming data pipelines that populate our mission-critical data lakehouse. * Innovate Tooling: Develop automated frameworks and self-service tools that streamline how users interact with data products across the company. Leveraging autonomous AI agents and AI Tooling. * Data Governance: Establish standard methodologies for data operations, lifecycle management, and strict SLA management for all datasets within your ownership area. * Optimize Performance: Improve existing large-scale data infrastructures by applying advanced algorithmic concepts to optimize code and underlying data system stacks. * Collaborate: Act as a data consultant for complex cross-functional projects, ensuring integrated solutions across Mail engineering teams.
Requirements
- Education: BS/MS in Computer Science, Engineering, or a relevant technical field.
- Experience: 6+ years of experience building scalable ETL/ELT pipelines using industry-standard orchestration (Airflow, Composer, or Oozie).
- Technical Mastery: Deep expertise in SQL, PySpark, or Scala.
- Scale: Proven track record of managing Multi-Terabyte/Petabyte datasets and solving large-scale challenges (e.g., skew mitigation, data sketches, and accumulation patterns).
- Cloud & DevOps: Professional experience with at least one major cloud provider (GCP, AWS, or Azure) and a strong command of GitOps workflows (CI/CD, PRs).
- Compliance: Experience working within GDPR and other data privacy frameworks.
- Soft Skills: Exceptional communication skills with the ability to prioritize tasks in a high-pressure, fast-paced environment.
Preferred Qualifications
- GCP Expertise: 3+ years of experience with Google Cloud Platform (BigQuery, Dataproc, Dataflow, Composer, Looker).
- AI/ML Alignment: Experience building data features specifically optimized for Machine Learning models and AI applications.
The material job duties and responsibilities of this role include those listed above as well as adhering to Yahoo policies ; exercising sound judgment ; working effectively, safely and inclusively with others ; exhibiting trustworthiness and meeting expectations ; and safeguarding business operations and brand integrity.
Benefits & conditions
The compensation for this position ranges from $128,250.00 - $266,875.00/yr and will vary depending on factors such as your location, skills and experience.The compensation package may also include incentive compensation opportunities in the form of discretionary annual bonus or commissions. Our comprehensive benefits include healthcare, a great 401k, backup childcare, education stipends and much (much) more.