Senior Software Engineer (Agentic Search) - Crawler
Job description
We are looking for a Senior Software Engineer to work on the content acquisition and crawling infrastructure of a novel search engine tailored for agentic AI consumption.
In this role, you will focus on building systems that discover, fetch, and continuously refresh content from the open web and other large-scale data sources. You will design distributed crawling, scheduling, and ingestion infrastructure capable of operating at internet scale while balancing coverage, freshness, resource efficiency, and reliability. You will work on systems that process billions of URLs, manage high-throughput data flows, and ensure that high-quality content is consistently available to downstream indexing and retrieval systems.
In this position, you will:
- Design, implement, and operate web-scale crawling systems for acquiring content from the internet
- Build ingestion workflows for internal and external data sources, including crawlers, structured feeds, and partner integrations
- Develop crawl scheduling, prioritisation, recrawl policies, and freshness strategies
- Build systems for URL discovery, deduplication, content extraction, and crawl orchestration
- Ensure reliable operation of crawling infrastructure under high-throughput conditions
- Define observability and quality metrics for crawl coverage, freshness, throughput, and content quality
- Monitor resource usage, bandwidth consumption, and infrastructure cost
- Collaborate with indexing and ML teams to ensure acquired content meets retrieval and ranking requirements
- Enable safe experimentation with crawling strategies and content acquisition policies
Requirements
- 5+ years of experience building backend or distributed systems
- Strong Go or C++ expertise
- Experience with large-scale distributed systems (10k+ RPS, billions of URLs, high-throughput pipelines)
- Understanding of web protocols (HTTP, DNS, TLS), crawling, scraping, and content extraction
- Experience operating production systems and debugging failures in distributed environments
- Strong understanding of scalability, fault tolerance, and resource management
Strong candidates may also have experience with:
- Web crawling
- Building streaming data pipelines and event-driven systems
- Kafka, Pulsar, NATS, RabbitMQ, or similar messaging platforms
- Designing distributed schedulers, queues, and asynchronous processing systems
- Spark, Flink, Beam, or MapReduce
- Ad tech, social networks, search engines, or other large-scale content platforms
Applicants must be authorized to work in the country in which they apply and will be required to provide proof of employment eligibility as a condition of hire.
Benefits & conditions
- Competitive compensation
- Career growth and learning opportunities
- Flexibility and ownership
- Collaborative and innovative culture
- Opportunity to work on impactful AI projects
- International environment and talented teams
What's it like to work at Nebius:
Fast moving - Bold thinking - Constant growth - Meaningful impact - Trust and real ownership - Opportunity to shape the future of AI