Machine Learning Engineer, Web Indexing Team
Job description
As part of this group, you will work within one of the most exciting high-performance computing environments, with petabytes of data and millions of queries per second, and have the opportunity to imagine and build products that delight our customers every single day.

We design and build infrastructure to support features that empower billions of Apple users through advanced intelligence systems. Our team processes trillions of links to find the best content to surface to users via search and other intelligent features. We also analyze pages to extract critical features for indexing, ranking, and retrieval. We apply statistical analysis to improve link selection, content freshness, retrieval rates, and extraction quality, among many other aspects. Furthermore, we are building a generic Retrieval-Augmented Generation (RAG) indexing framework that allows customized index building and supports fast experiment iteration. You'll have the opportunity to dive deeper into RAG systems, as well as large-scale data processing, managing trillions of records, petabytes of data, and the incredible complexity behind Apple Intelligence products.
Requirements
- 7+ years of software engineering experience, with a strong focus on large-scale distributed systems or infrastructure
- Strong coding proficiency in one or more languages (e.g., Python, Java, Go, C++)
- Solid foundation in computer science fundamentals including algorithms and data structures
- Hands-on experience with large-scale data processing and MapReduce-style frameworks (e.g., Spark, Hadoop)
- Experience with cloud services, particularly AWS (S3, EC2, EKS) and/or Kubernetes-based orchestration
- Proven ability to operate independently and drive projects end-to-end in a collaborative team environment
- MS or PhD in Computer Science or a related field, or equivalent practical experience
Preferred qualifications
- 10+ years of software engineering experience, with demonstrated impact at a senior or staff level
- Experience designing and building RAG systems or other AI/ML indexing pipelines
- Background in machine learning, natural language processing, or information retrieval
- Experience managing systems operating at petabyte scale with trillions of records
- Strong communication skills with a track record of influencing technical direction across teams
- MS or PhD in Computer Science or a related field, or equivalent practical experience