Senior Software Engineer

Yahoo
Richardson, United States of America
2 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Regular working hours
Languages
English
Experience level
Senior
Compensation
$299K

Job location

Richardson, United States of America

Tech stack

Java
Artificial Intelligence
Data analysis
Big Data
Google BigQuery
Cloud Storage
Data Governance
ETL
Data Security
Data Systems
Data Warehousing
Distributed Systems
Elasticsearch
Data Flow Control
Hadoop
MapReduce
Hive
Python
Machine Learning
Software Architecture
TensorFlow
Software Engineering
Data Streaming
Data Processing
Freeform SQL
Google Cloud Platform
Data Storage Technologies
Data Ingestion
Spark
Yahoo Mail
Information Technology
Kafka
Data Management
Data Pipelines
Apache Beam
Amazon Web Services (AWS)
Service Stack

Job description

Yahoo serves as a trusted guide for hundreds of millions of people globally, helping them achieve their goals online through our portfolio of iconic products. For advertisers, Yahoo Advertising offers omnichannel solutions and powerful data to engage with our brands and deliver results.

The ideal candidate will have strong distributed-systems knowledge and AI/ML experience, and will design, build, and optimize scalable data pipelines and infrastructure that power advanced analytics and machine learning solutions. In this role, you will collaborate closely with software engineers, product owners, and business stakeholders to prepare and transform large datasets (including real-time pipelines), support end-to-end development and deployment, and ensure robust, efficient, and secure data flows. You will leverage your expertise in cloud platforms, big data tools, and machine learning frameworks to drive innovation and deliver actionable insights that advance our organization's AI initiatives and business objectives.

Little About Us:

Yahoo's Central Data team manages massive-scale (100+ petabyte) data systems to glean insights into Yahoo products and to improve the experience for its 1B+ user base. The team provides the foundations for user engagement data collection and processing for all of Yahoo's users, as well as operational excellence, anomaly detection, and governance across the organization. Your work will directly influence product changes, and you will work on a team of talented and motivated engineers to improve the user experience on popular Yahoo sites and apps such as Yahoo Mail, the Yahoo Homepage, Yahoo Sports, Yahoo Finance, Yahoo News, and many other new products.

A Lot about You:

  • Apply software engineering expertise to build high-performance, scalable data warehouses.
  • Be excited to learn and take ownership for large-scale projects spanning many tech stacks and environments.
  • Design, build, and launch efficient and reliable data pipelines to move and transform data at multi-petabyte scale using the latest technologies.
  • Build real-time analytics and ingestion pipelines capable of processing more than a million events per second, delivering insights at sub-second latency.
  • Interact with product owners and end users to understand and solve new business requirements as they emerge.
  • Design and audit processes to ensure the delivery of high-quality data through rigorous QA checks.
  • Have excellent data modeling skills to understand the nuances of various dimension and metric types in warehouses.
  • Design workflows to ingest, load and present new data sets for users.
  • Provide active support and participate in the on-call rotation for production pipelines (typically a couple of times each quarter).
  • Define and manage SLAs for all data sets in allocated areas of ownership.
  • Work with the production engineering / infrastructure team to drive resolution of production issues.
  • Experience or familiarity with some of the following tools: Kafka, Storm, streaming frameworks (Spark, Dataflow), Elasticsearch.
  • Design, build, and maintain scalable data pipelines and ETL processes to support machine learning and AI initiatives on Google Cloud Platform (GCP).
  • Implement and optimize data storage solutions using GCP services such as BigQuery, Cloud Storage, and Dataflow.
  • Ensure data quality, integrity, and security throughout the data lifecycle.
  • Collaborate with data scientists, analysts, and business stakeholders to understand data requirements and deliver actionable insights.
  • Monitor, troubleshoot, and maintain the health and performance of cloud-based data infrastructure.
  • Automate manual processes and repetitive tasks to improve efficiency and reduce errors.
  • Apply data governance and compliance best practices to protect sensitive information and meet regulatory standards.
  • Stay current with new GCP features, tools, and best practices to continuously enhance data management capabilities.
  • Document solutions, processes, and architectural decisions to facilitate knowledge sharing and maintainability.
  • Experience working with MapReduce or another parallel data processing system.
  • Experience with schema design and dimensional data modeling.
  • Comfortable writing complex SQL queries.
  • Strong data mindset with a deep appreciation for analyzing data to identify product gaps and enhancements to improve user engagement and revenue growth.
  • Excellent communication skills and the ability to tell insightful stories using data, as well as manage communication with internal teams and stakeholders.

#LI-FM1

The material job duties and responsibilities of this role include those listed above as well as adhering to Yahoo policies; exercising sound judgment; working effectively, safely, and inclusively with others; exhibiting trustworthiness and meeting expectations; and safeguarding business operations and brand integrity.

At Yahoo, we offer flexible hybrid work options that our employees love! While most roles don't require regular office attendance, you may occasionally be asked to attend in-person events or team sessions. You'll always get notice to make arrangements. Your recruiter will let you know if a specific job requires regular attendance at a Yahoo office or facility. If you have any questions about how this applies to the role, just ask the recruiter!

Requirements

  • BS/MS in Computer Science and/or Mathematics/Statistics
  • 4+ years of experience in relevant software development, with at least 2 years of professional Java and/or Python experience
  • 2+ years of experience in the Big Data pipeline and analytics space, with experience across technology stacks.
  • 2+ years of experience in custom ETL design, implementation, and maintenance using Big Data stack environments (Hadoop, MapReduce, Pig, Hive, AWS EMR, Apache Beam, Google Cloud Platform Dataflow, BigQuery).

Benefits & conditions

The compensation for this position ranges from $143,625.00 - $299,375.00/yr and will vary depending on factors such as your location, skills, and experience. The compensation package may also include incentive compensation opportunities in the form of a discretionary annual bonus or commissions. Our comprehensive benefits include healthcare, a great 401k, backup childcare, education stipends and much (much) more.

Apply for this position