Web Scraping Engineer: 65 open roles, 31 companies — the smallest niche with the most interesting work

There are only 65 open web scraping engineering roles in the public ATS feeds I track, across 31 companies. That's small. But the companies hiring are doing work that matters: AI training data curation, real-time market intelligence, LLM grounding data, anti-bot evasion research. The web scraping specialisation is tiny, technically demanding, and more consequential than its size suggests.

I track Web Scraping Engineer roles from public ATS feeds. Here's the snapshot for July 2026.

What web scraping engineers do

Web scraping engineers build systems that extract structured data from the unstructured web — at scale, reliably, and despite active countermeasures. The job requires a specific stack: async Python (httpx, aiohttp, Scrapy), browser automation (Playwright, Puppeteer), anti-bot evasion, proxy management, HTML parsing (BeautifulSoup, Parsel), and distributed crawl orchestration.
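To make the stack above concrete, here is a minimal sketch of the fetch-and-extract loop at the heart of most crawlers: a bounded-concurrency async crawl plus link extraction. It uses only the Python standard library so it runs offline. The names `crawl`, `extract_links`, `fake_fetch`, and `FAKE_SITE` are assumptions for illustration; in a real system the `fetch` callable would wrap an HTTP client such as httpx or aiohttp, and the parsing would more likely use Parsel or BeautifulSoup.

```python
import asyncio
from html.parser import HTMLParser
from typing import Awaitable, Callable, Dict, Iterable, List
from urllib.parse import urljoin


class LinkExtractor(HTMLParser):
    """Collects absolute URLs from <a href="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the page URL.
                self.links.append(urljoin(self.base_url, href))


def extract_links(base_url: str, html: str) -> List[str]:
    parser = LinkExtractor(base_url)
    parser.feed(html)
    return parser.links


async def crawl(
    urls: Iterable[str],
    fetch: Callable[[str], Awaitable[str]],
    max_concurrency: int = 5,
) -> Dict[str, List[str]]:
    """Fetch every URL with at most `max_concurrency` requests in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def worker(url: str):
        async with semaphore:
            html = await fetch(url)
        return url, extract_links(url, html)

    results = await asyncio.gather(*(worker(u) for u in urls))
    return dict(results)


# Hypothetical in-memory "site" so the sketch runs without network access.
FAKE_SITE = {
    "https://example.com/": '<a href="/a">A</a> <a href="https://example.org/b">B</a>',
    "https://example.com/a": "<p>no links here</p>",
}


async def fake_fetch(url: str) -> str:
    await asyncio.sleep(0)  # stand-in for network latency
    return FAKE_SITE[url]


if __name__ == "__main__":
    print(asyncio.run(crawl(list(FAKE_SITE), fake_fetch)))
```

Passing the fetcher in as a parameter keeps the concurrency logic testable without a network, which is one reasonable design choice among several.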

In the AI era, scraping has acquired new strategic importance. Large language models need training and grounding data — and much of that data lives on websites that have to be crawled. Companies building AI products with web-sourced context (RAG systems, AI search, real-time LLM knowledge) need engineers who can extract, clean, and pipeline web data at scale.

Anti-bot evasion has simultaneously gotten harder. Companies protecting their content have deployed increasingly sophisticated bot detection (Cloudflare Turnstile, DataDome, PerimeterX). The resulting arms race (human-realistic browser fingerprinting, residential proxy rotation, behavioral mimicry) is a genuine technical discipline, not a simple scripting task.
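Of the techniques named above, proxy rotation is the easiest to illustrate. The sketch below shows one possible shape for it: round-robin selection with a cooldown for proxies that have just failed. The `ProxyPool` class, its method names, and the 60-second default are hypothetical choices for illustration, not any vendor's API, and the clock is injectable only so the behaviour can be checked deterministically.

```python
import itertools
import time
from typing import Callable, Dict, Iterable, Optional


class ProxyPool:
    """Round-robin proxy rotation with a cooldown for proxies that fail."""

    def __init__(
        self,
        proxies: Iterable[str],
        cooldown_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._proxies = list(proxies)
        if not self._proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(self._proxies)
        self._benched_until: Dict[str, float] = {}
        self._cooldown = cooldown_seconds
        self._clock = clock

    def next_proxy(self) -> Optional[str]:
        """Return the next proxy that is not cooling down, or None if all are."""
        now = self._clock()
        # Look at each proxy at most once per call.
        for _ in range(len(self._proxies)):
            proxy = next(self._cycle)
            if self._benched_until.get(proxy, 0.0) <= now:
                return proxy
        return None

    def report_failure(self, proxy: str) -> None:
        """Bench a proxy (e.g. after a block page or timeout) for the cooldown."""
        self._benched_until[proxy] = self._clock() + self._cooldown
```

A caller would ask for `next_proxy()` before each request and call `report_failure()` when a response looks blocked. Production systems typically layer much more on top (per-domain state, reputation scoring, session stickiness).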

The data: 65 open roles across 31 companies

65 active roles. 31 companies. 1 new role in the last 7 days.

The market is small and slow-moving: one new role in the past week means posting volume is low. These roles open when a company specifically needs scraping expertise; they aren't posted and reposted continuously.
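For a sense of scale, the headline figures reduce to two simple ratios. This is a small arithmetic sketch using only the three numbers quoted above; the variable names are mine.

```python
# Headline figures from the July 2026 snapshot.
active_roles = 65
companies = 31
new_roles_last_7_days = 1

# Average number of open roles per hiring company.
roles_per_company = active_roles / companies

# This week's new postings as a share of all active roles.
weekly_inflow_share = new_roles_last_7_days / active_roles

print(f"Average open roles per company: {roles_per_company:.2f}")   # 2.10
print(f"New roles this week vs. active roles: {weekly_inflow_share:.1%}")  # 1.5%
```

One week is a single observation, so the inflow share is a snapshot rather than a trend.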

Who's hiring

Company	Open Scraping roles
HumanSignal	22
Grafana Labs	7
Oxylabs	2
Exa	2

HumanSignal at 22 is the standout — they build Label Studio, an open-source data labeling platform used heavily for AI training data. Their scraping engineering team works on data collection pipelines that feed labeling workflows; they're effectively a data infrastructure company for the AI industry.

Grafana Labs at 7 is more operational: they scrape monitoring data and run synthetic testing across distributed systems. The "scraping" in their case is systems monitoring and observability data collection rather than web extraction.

Oxylabs is a proxy infrastructure company — they build the tools that scraping engineers use, so their internal roles are on the infrastructure side: proxy network management, IP reputation systems, detection evasion research.

Exa (formerly Metaphor) is AI search — they build web search infrastructure that requires large-scale crawling and extraction as a core competency.
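The table also shows how concentrated the niche is. The sketch below works through the arithmetic using only the counts listed above and the 65-role, 31-company totals; the variable names are mine, and the "excluding Grafana Labs" figure is my own reading of the note that their roles are observability work rather than web extraction.

```python
# Totals and per-company counts taken from the table above.
total_roles = 65
total_companies = 31
listed = {"HumanSignal": 22, "Grafana Labs": 7, "Oxylabs": 2, "Exa": 2}

listed_roles = sum(listed.values())                    # 33
top_share = listed["HumanSignal"] / total_roles        # largest single employer
listed_share = listed_roles / total_roles              # the four listed companies

# Everything not shown in the table.
remaining_roles = total_roles - listed_roles           # 32
remaining_companies = total_companies - len(listed)    # 27
avg_remaining = remaining_roles / remaining_companies

# Assumption: treat Grafana Labs' roles as non-web scraping and set them aside.
web_extraction_roles = total_roles - listed["Grafana Labs"]  # 58

print(f"HumanSignal share of all roles: {top_share:.1%}")          # 33.8%
print(f"Four listed companies' share:   {listed_share:.1%}")       # 50.8%
print(f"Avg roles at the other {remaining_companies} companies: {avg_remaining:.2f}")  # 1.19
print(f"Roles excluding Grafana Labs:   {web_extraction_roles}")   # 58
```

In other words, one company accounts for roughly a third of the postings, and most of the remaining hirers have a single open role.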

Salary

The active base of 65 roles yields only 6 roles with published salary data, too small a sample for meaningful statistics. I've omitted salary statistics here to avoid misleading inference from 6 data points.

For rough calibration (not a statistic from this dataset): scraping engineers with strong Python, async, and browser automation skills typically command $140k–$200k at companies that actively hire for this role, with the high end at proxy infrastructure companies (Oxylabs, Bright Data) and AI data companies (Scale AI, HumanSignal) where scraping is core to the product.

What this tells us about the web scraping market

1. The specialisation is genuinely rare. With only 65 open roles in the tracked feeds, this is a narrow discipline, and the pool of experienced practitioners is correspondingly small. Scraping engineers who can handle modern anti-bot evasion, distributed crawling at scale, and clean data pipeline design are not easy to find, and companies building AI data products will pay accordingly.

2. The AI connection is structural. HumanSignal and Exa both appear because their products are directly in the AI data layer — training data and AI search respectively. The growth of AI increases demand for scraped data. This is one of the few technical specialisms where the AI era creates more demand for the original human skill rather than automating it.

3. The proxy infrastructure layer is underexplored as an employer. Oxylabs, Bright Data, Smartproxy — these are 100–500 person companies that are effectively tech infrastructure companies for the scraping industry. Their engineering roles are technically interesting, well-compensated, and don't appear in most engineers' job search because they're not high-profile tech brand names.

4. Small market, high leverage. For engineers with scraping skills, the 31-company market means a targeted job search is efficient. You can reach every active hirer in this niche with a focused outreach campaign. The specificity of the skill also means your application looks very different from a generic software engineering candidate — which is an advantage, not a disadvantage.

The board

latchhire.com/board/scraping — updated daily. Every role links to the original posting.

Data pulled 2026-07-01. Active roles only. Salary statistics omitted: only 6 roles in this niche publish salary data, and numbers from a sample that small would be misleading.
