OPS-25-C2 · Agent Operations

Autonomous research agent for a financial intelligence firm

A goal-driven agent that crawls filings, press, social and internal sources and produces structured analyst briefings every morning before 7 AM ET.

−72% analyst hours
Sector
Finance · Equity research
Surfaces
Agents · LLM · Data
Runtime
8 months autonomous
Published
2025-04-15

Challenge

The client, a boutique equity research firm with 14 analysts covering 220 tickers, spent each morning (5:00–8:30 AM ET) manually pulling data from SEC EDGAR filings, press, conference transcripts, social channels and internal databases. Every analyst needed briefings on their 12–18 tickers before markets opened.

An earlier attempt to replace this work with ChatGPT ended in chaos: the model confabulated, mixed up tickers, had no PDF access and hallucinated numbers. The client needed a system that delivers facts, not essays.

Approach

We built an agent with three layers: a tool layer (12 tools, including SEC fetch, PDF extraction, press search, Twitter/X scraping, historical DB lookup and financial calculations); a reasoning layer (Claude as planner, picking the tools and their order for each ticker); and an output layer (structured JSON, validated schema-first and rendered as a Markdown briefing).
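The three layers can be pictured with a minimal sketch. Everything in it is an illustrative assumption rather than the project's actual code: the two stub tools, the `plan` function (a fixed list standing in for the LLM planner call so the sketch runs offline) and the `Briefing` shape are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable

# Tool layer: a registry of callables keyed by name. These two stubs are
# hypothetical stand-ins for real tools such as SEC fetch or press search.
def sec_fetch(ticker: str) -> dict:
    return {"tool": "sec_fetch", "ticker": ticker, "items": []}

def press_search(ticker: str) -> dict:
    return {"tool": "press_search", "ticker": ticker, "items": []}

TOOLS: dict[str, Callable[[str], dict]] = {
    "sec_fetch": sec_fetch,
    "press_search": press_search,
}

# Reasoning layer: in the described system an LLM picks the tools and their
# order per ticker. A fixed plan stands in for that call here (assumption).
def plan(ticker: str) -> list[str]:
    return ["sec_fetch", "press_search"]

# Output layer: a structured object first, Markdown only as a rendering step.
@dataclass
class Briefing:
    ticker: str
    sections: list[dict]

def run(ticker: str) -> Briefing:
    sections = []
    for name in plan(ticker):
        if name not in TOOLS:
            # Reject tool names the planner made up instead of guessing.
            raise KeyError(f"planner requested unknown tool: {name}")
        sections.append(TOOLS[name](ticker))
    return Briefing(ticker=ticker, sections=sections)

def render_markdown(briefing: Briefing) -> str:
    lines = [f"# {briefing.ticker}"]
    for section in briefing.sections:
        lines.append(f"## {section['tool']} ({len(section['items'])} items)")
    return "\n".join(lines)
```

Keeping the planner's output as plain tool names checked against a registry means an invented tool name fails loudly instead of silently producing an empty section.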

Every briefing goes through validation: every number must have a source citation, every claim must have a date, and there must be no internal contradictions. Briefings that fail land in a queue for the analyst, with concrete pointers on what went wrong.
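A rough sketch of what such checks could look like follows. The `Claim` fields and the contradiction rule (two different values for the same metric on the same date) are assumptions chosen for illustration; the source does not describe the real schema or rules in this detail.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Claim:
    text: str
    as_of: Optional[str] = None     # ISO date the claim refers to
    source: Optional[str] = None    # citation, e.g. a filing reference
    metric: Optional[str] = None    # set when the claim carries a number
    value: Optional[float] = None

def validate(claims: list[Claim]) -> list[str]:
    """Return readable problems; an empty list means the briefing passes."""
    problems: list[str] = []
    seen: dict[tuple[str, str], float] = {}
    for i, claim in enumerate(claims):
        if claim.as_of is None:
            problems.append(f"claim {i}: missing date")
        if claim.value is not None and not claim.source:
            problems.append(f"claim {i}: number without source citation")
        if claim.metric and claim.value is not None and claim.as_of:
            key = (claim.metric, claim.as_of)
            # Assumed contradiction rule: same metric and date, different value.
            if key in seen and seen[key] != claim.value:
                problems.append(
                    f"claim {i}: contradicts earlier {claim.metric} value for {claim.as_of}"
                )
            seen.setdefault(key, claim.value)
    return problems
```

Returning messages rather than a bare boolean matches the idea of giving the analyst concrete pointers: the list can go straight into the review queue.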

Evaluation: together with the client we built a test set of 280 historical briefings (rated S/A/B/C/F), and the agent must average above A- on it before any production change ships. We went through three iterations of prompts and tools before the first deployment.
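The release gate can be sketched as a simple average over graded briefings. The numeric scale below is an assumption: the source names the grades and the A- bar but not how letters map to numbers, so a GPA-style mapping is used here purely for illustration.

```python
# Assumed GPA-style mapping; the real scoring scale is not given in the source.
GRADE_POINTS = {"S": 4.3, "A": 4.0, "B": 3.0, "C": 2.0, "F": 0.0}
A_MINUS = 3.7  # assumed numeric value of the A- threshold

def passes_gate(grades: list[str], threshold: float = A_MINUS) -> bool:
    """True only if the mean grade is strictly above the threshold."""
    if not grades:
        raise ValueError("no grades to evaluate")
    mean = sum(GRADE_POINTS[g] for g in grades) / len(grades)
    return mean > threshold
```

Under this mapping a run graded A, A, B averages about 3.67 and is blocked, while A, A, A, B averages 3.75 and passes.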

Outcome

Briefings are ready every day by 6:40 AM ET. All 220 tickers are covered, and average briefing quality (assessed weekly by clients) sits between A- and A. Hallucinations, strictly assessed, account for 0.4% of claims, and each one was caught by validation.

Analysts saved an average of 72% of the time previously spent on data collection. That time goes into analysis, client calls, and less obvious insights.

Operating cost: ~$3,200/mo (mostly Anthropic API). The client cancelled the search for two junior research associates, a monthly saving of ~$22,000.
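Taking both approximate figures at face value, the monthly arithmetic works out as follows. This is a back-of-the-envelope sketch only; it treats the avoided hiring cost as a saving and ignores build cost, which the source does not state.

```python
operating_cost = 3_200    # USD per month, approximate figure from the text
avoided_hiring = 22_000   # USD per month, approximate figure from the text

net_saving = avoided_hiring - operating_cost
ratio = avoided_hiring / operating_cost

print(f"net: ~${net_saving:,}/mo, ratio: ~{ratio:.1f}x")
# prints: net: ~$18,800/mo, ratio: ~6.9x
```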

Stack

Claude (Anthropic) · LangGraph · Python · PyMuPDF · Postgres · pgvector · Temporal · Next.js dashboard

Metrics

  • Tickers covered: 220
  • Briefing ready by: 6:40 ET
  • Time saved: 72%
  • Hallucination rate: 0.4%
  • Operating cost: $3.2k/mo
  • Avg briefing grade: A-

Similar problem in your business?

Every project is different, but patterns repeat.

If you recognise pieces of this case study in your own situation, get in touch. We can usually tell in the first call whether the job is on the scale of hours per week or months of infrastructure work.