Continuous lead intelligence system for a B2B SaaS sales team
Daily multi-source enrichment of 40k+ accounts, intent signals, decision-maker mapping. Hands-off since Q2 2024.
28 job boards, 14 countries, 3.2M active openings. Daily skills + salary trends. SaaS for HR-tech compass.
An HR-tech SaaS client (€8M ARR, 280 corporate clients) delivered compensation benchmarking and talent acquisition intelligence for HR teams. Their differentiator: comprehensive job market data — who is hiring, which skills, what salaries, which locations are hot. Clients used it for salary benchmarking, location planning, competitive intelligence.
State in 2024: survey-based methodology, data aggregated quarterly from partner companies. Coverage slow — corporate clients wanted answers to "what should I pay a senior React engineer in Berlin right now", but survey data was 3–6 months old, inconsistent across firms, and did not show granular trends.
An internal "let us scrape job boards" attempt in 2023 produced chaotic results: 8 job boards integrated, ad-hoc parsers, ~40% success rate, data not comparable across sources (every board reporting salary differently). After 4 months the client paused the project.
We designed a production-grade job market intelligence platform: 28 job boards across 14 EU countries (LinkedIn Jobs API + scrape, Indeed, StepStone, Pracuj.pl, Welcome to the Jungle, niche tech boards like StackOverflow Jobs, Wellfound, etc.).
Critical challenges:
Salary normalisation: every board reports salary differently — "competitive", "from X", range "X-Y", net vs gross, annual vs monthly, with/without benefits. LLM-driven normalisation (Claude Haiku for cost) parses text descriptions and extracts canonical "EUR annual gross, salary range, benefits flag".
Skills extraction: NLP pipeline (custom + Claude) pulls skills from job descriptions in 8 languages. Normalisation to ESCO skills taxonomy plus custom skills extended for tech roles.
Cross-board deduplication: the same opening appears on 5+ boards. Dedup heuristic: company + job title similarity + posting timestamp window + skills overlap.
Architecture: Temporal orchestration, Playwright for scrape (LinkedIn Jobs via API where possible), Claude Haiku for NLP normalisation (~$0.001 per job parsed), PostgreSQL canonical store, ClickHouse for time-series analytics, OpenSearch for full-text job search for client UI.
3.2M active openings monitored daily across 14 EU markets. Coverage estimate: 87% of publicly visible openings (measured vs ECB/Eurostat data).
Compensation benchmarking accuracy: client reports 23% improvement in predicted-vs-actual offer matching (vs survey-based baseline). Meaning: HR negotiating with a candidate uses platform data, offers are more accurate, accept rate up.
Market report time: pre-deployment, a custom report for a client took 8–12 days of analyst work. Post-deployment: the same report is generated automatically in 2–4 hours + analyst review (1–2h). 12× speedup.
Client portfolio growing: 280 corporate clients at start, 412 after 10 months. ARR grew from €8M to €13.2M. The client attributes the platform as a key driver.
System operating cost: $14k/month (LLM + proxy + compute), $4k/month retainer maintenance. ROI for the client: 28× in the first year post-deployment.
Daily multi-source enrichment of 40k+ accounts, intent signals, decision-maker mapping. Hands-off since Q2 2024.
Resilient scraping with anti-bot routing, SKU normalization and 5-minute price-change webhooks into the client's repricing engine.
Goal-driven agent crawling filings, press, social and internal sources — producing structured analyst briefings every morning before 7 AM ET.
If you recognise pieces of this case study in your own situation — write. We usually see in the first call whether it is hours-per-week scale or months of infrastructure.