Web Scraping Engineer
Job description
- Morning (7am ET). Review overnight pipeline logs. Identify failures, anomalies, or coverage gaps. Triage fixes and follow-ups.
- Engineering work. Fix PHP bugs. Update scrapers as target sites change their structure or defenses. Update cron logic. Ship incremental improvements to collection coverage and quality.
- Data quality. Run checks against current and historical baselines to confirm coverage and accuracy.
- Client questions. Respond to client inquiries about coverage, methodology, or anomalies as they come in.
- Strategy. Periodically discuss collection strategies, help scope and stand up new datasets, and contribute to new products and features.
This is operational work with a steady rhythm. The reward is in keeping an important data product running well, and in being good at a specific kind of hard problem (scraping hard sites at scale) that few people are good at.
Requirements
- Production PHP experience. CodeIgniter 4 is a strong plus. You must be able to point to public PHP work: a repo, contributions to a project, a blog post, or similar.
- Python proficiency with modern scraping libraries. Working fluency in Playwright, Scrapy, Selenium, Requests, httpx, BeautifulSoup, or comparable. Real scraping work lives in this toolkit.
- Demonstrated experience scraping hard targets at scale. Sites with active anti-bot defenses, dynamic rendering, rate-limit walls, or aggressive blocking. You must include a link to public scraping work in your application, or describe a specific scraper you built in detail (target, defenses encountered, how you solved them).
- MySQL competence. Reading and writing non-trivial queries against tables with hundreds of millions of rows.
- Schedule. 7am US Eastern start.
- US-based with verifiable employment history.
Strong plusses
- Direct experience with anti-bot evasion: residential proxies, TLS fingerprint matching, JA3, header rotation, CAPTCHA strategy.
- Comfort with mature, incrementally maintained codebases.
- Background in financial data, alternative data, or equity research.
- Node.js, Puppeteer, or additional automation tooling.
Experience
- Production PHP: 3 years (Required)
- Production web scraping: 2 years (Required)
Benefits & conditions
$120,000 - $140,000 a year - Full-time, Contract
- Flexible schedule