Software Engineer III
Job description
As a Software Engineer III specializing in data extraction, you will be responsible for the end-to-end lifecycle of web-based data collection. This includes designing scalable crawling architectures, reverse-engineering web applications to identify data points, and implementing evasion techniques to bypass IP rate-limiting and bot detection. You will also manage the storage and integrity of this data using advanced SQL and relational database management.
- Spider Development: Design and deploy robust, distributed spiders and crawlers to extract data from a variety of web architectures (SPAs, SSR, etc.).
- Bot Evasion Engineering: Research and implement strategies to bypass anti-scraping technologies, including proxy rotation, browser fingerprinting, and CAPTCHA solving.
- Database Management: Create and optimize SQL schemas for large-scale data storage and perform complex data transformations and validation.
- System Maintenance: Proactively monitor the health of extraction agents and refactor code quickly in response to target website updates or layout changes.
- Performance Optimization: Utilize asynchronous Python programming to maximize the throughput and efficiency of data collection pipelines.
Requirements
We are looking for a Senior Python Engineer with a "hacker" mindset to join our team as a Software Engineer III. This role is dedicated to large-scale web scraping and data harvesting. If you have deep experience with Scrapy or Playwright, know how to defeat Cloudflare or DataDome, and can write high-performance SQL to manage millions of records, we want to hear from you. This is a specialized role for an engineer who enjoys reverse-engineering the web to unlock data.
- Advanced Python: Mastery of Python 3.x with deep experience in extraction frameworks (Scrapy, Playwright, Selenium, or Puppeteer).
- Technical Resilience: Proven ability to bypass high-level bot detection (e.g., Cloudflare, Akamai, or PerimeterX).
- Database Mastery: Expert-level SQL skills and experience managing relational databases like PostgreSQL or MySQL.
- Network Proficiency: Expert understanding of HTTP/S, TCP/IP, TLS fingerprinting, and browser-header manipulation.
- Problem Solving: A specialized ability to reverse-engineer JavaScript-heavy websites and hidden API endpoints.
- Able to write, debug, and deploy complex Python code in a distributed environment.
- Must be able to analyze and interpret complex web structures and network traffic using browser developer tools.
- Ability to design and maintain relational database tables containing millions of rows.
- Able to pivot and respond quickly to technical "break-fixes" to ensure data continuity for the business.
- Able to collaborate with data analysts to define and validate data requirements and output formats.
Education & Experience:
- Bachelor's degree in Computer Science, Information Systems, or a related field (or equivalent professional experience).
- Minimum of 5 years of experience in software engineering, with at least 2-3 years focused specifically on large-scale web scraping or data extraction.
Benefits & conditions
Benefits at Babel Street (just to name a few...)
- Health Benefits: Babel Street covers 85-100% of monthly premium costs for Medical, Dental, Vision, Life & Disability insurance, for you and your family!
- Retirement Plans: Babel Street offers both a Traditional and Roth 401(k) with a very competitive match.
- Unlimited Flexible Leave: We trust our employees to manage their own time and balance their personal and work lives.
- Holidays: Babel Street provides employees with 12 paid Federal Holidays.
- Tuition Reimbursement: We are committed to investing in our employees. One way we do that is with our Tuition Reimbursement Program for continuing education.