Web Scraping Engineer

ALPHAMATICIAN LLC
7 days ago

Role details

Contract type
Permanent contract
Employment type
Full-time (> 32 hours)
Working hours
Shift work
Languages
English

Tech stack

PHP
CodeIgniter
Web Scraping
Python
MySQL
Node.js
Selenium
Captcha
Puppeteer (Software)
Playwright

Job description

  • Morning (7am ET). Review overnight pipeline logs. Identify failures, anomalies, or coverage gaps. Triage fixes and follow-ups.
  • Engineering work. Fix PHP bugs. Update scrapers as target sites change their structure or defenses. Update cron logic. Ship incremental improvements to collection coverage and quality.
  • Data quality. Run checks against current and historical baselines to confirm coverage and accuracy.
  • Client questions. Respond to client inquiries about coverage, methodology, or anomalies as they come in.
  • Strategy. Periodically discuss collection strategies, help scope and stand up new datasets, and contribute to new products and features.

This is operational work with a steady rhythm. The reward is in keeping an important data product running well, and in being good at a specific kind of hard problem (scraping hard sites at scale) that few people are actually good at.

Requirements

  • Production PHP experience. CodeIgniter 4 is a strong plus. You must be able to point to public PHP work: a repo, contributions to a project, a blog post, or similar.
  • Python proficiency with modern scraping libraries. Working fluency in Playwright, Scrapy, Selenium, Requests, httpx, BeautifulSoup, or comparable.
  • Demonstrated experience scraping hard targets at scale. Sites with active anti-bot defenses, dynamic rendering, rate-limit walls, or aggressive blocking. You must include a link to public scraping work in your application, or describe a specific scraper you built in detail (target, defenses encountered, how you solved them).
  • MySQL competence. Reading, writing, and optimizing non-trivial queries against tables with hundreds of millions of rows.
  • Schedule. 7am US Eastern start.
  • US-based with verifiable employment history.

Strong Plusses:

  • Direct experience with anti-bot evasion: residential proxies, TLS fingerprint matching, JA3, header rotation, CAPTCHA strategy.
  • Comfort with mature, incrementally maintained codebases rather than green-field environments.
  • Background in financial data, alternative data, or equity research.
  • Node.js, Puppeteer, or additional automation tooling.

About the company

Alphamatician is hiring a Web Scraping Engineer for an individual-contributor role at the heart of our data operations. The work is technical, operational, and quietly important: keeping a mature alternative-data product running reliably for institutional investors.
