How is cohort intelligence different from AI stock prediction?

Point-prediction stock forecasting returns a single number (e.g. 'NVDA +2.3% over 5 days') with no uncertainty bounds, regime context, or explanation. Cohort intelligence returns 300 historical analogs, the full distribution of what they did next (median, p10/p90, win rate), and the features that separated winners from losers — a distribution and an explanation, not a guess.

Why does cohort intelligence work better than prediction for AI agents?

AI agents need structured facts they can reason about. A point forecast gives an agent nothing to introspect. A cohort response (300 analogs, win rate 44%, p10/p90 -11%/+7%, low-volatility regime headwind) gives the agent calibrated facts it can weigh and combine. In a blind paired evaluation, AI agents with cohort intelligence tools beat identical agents without them 50-0 across 50 out-of-sample scenarios.

Is cohort intelligence the same as k-NN regression?

Mechanically it uses k-NN retrieval. The substantive differences are: a self-supervised learned embedding space (256-dim, trained on minute-bar data and covering ~25M chart patterns) rather than hand-engineered features; a split conformal calibration layer so that empirical band coverage matches nominal; and metadata-conditioned filtering by regime, sector, news context, and time-since-earnings.

What does a cohort intelligence API response contain?

Four components: (1) the cohort itself — the 300 historical chart patterns most similar to the anchor, by symbol and date; (2) the full forward-return distribution at 1, 5, and 10 days (median, mean, p10, p90, win rate, std); (3) feature attribution — which features separated winners from losers within the cohort; (4) regime stratification — how the cohort splits by current market regime.

How do I use cohort intelligence in an AI agent?

Install the MCP server with `pip install chartlibrary-mcp`. Configure your MCP-aware agent (Claude Desktop, Cursor, Cline, custom) to load it. The agent can then call `cohort_analyze` with a (symbol, date, timeframe) anchor and get back the full Layer 3 intelligence response. REST and Python SDK paths are also available.

Does cohort intelligence work as an API?

Yes. Chart Library exposes cohort intelligence as a REST API and as an MCP server (`pip install chartlibrary-mcp`) wired into Claude, Cursor, and other MCP-aware agents. The free Sandbox tier supports the text search and follow-through endpoints; full cohort intelligence (`/api/v1/cohort_analyze` and related Layer 3 endpoints) is part of the paid Builder tier, from $29/mo.


Cohort intelligence: what it is, why it beats stock prediction.

Cohort intelligence is the practice of answering “what did this chart pattern do next?” by retrieving the cohort of historical analogs to a (symbol, date, timeframe) anchor and reporting the full distribution of what those analogs realized at 1, 5, and 10 days forward — together with the features that separated winners from losers, the regimes the cohort lived in, and conformal-calibrated probability bands.

It is the alternative to point-prediction stock forecasting (“NVDA: +2.3% over 5 days”) — a different shape of answer, designed for AI agents and quantitative researchers who need calibrated facts they can reason about, not opaque numbers they have to trust.

In one paragraph

Cohort intelligence retrieves the 300 historical chart patterns most similar to a stock at a given moment, then returns what those analogs actually did next: full forward-return distributions at 1/5/10 days, the features that separated winning analogs from losing ones, and how outcomes split by market regime. It is built on 25M+ self-supervised pattern embeddings spanning 10 years of minute-bar data, drawn from a 19,000+ US-equity universe, and calibrated with split conformal prediction so that the 80% probability bands actually cover 80% of held-out anchors. It is the methodology-honest alternative to AI stock prediction, designed to give AI agents structured facts they can reason about rather than point forecasts they have to trust.

Independently validated — 50–0 in a blind paired AI-agent evaluation

The case for cohort intelligence over point prediction is not just methodological. It’s measured. In a blind LLM-judged paired evaluation across 50 out-of-sample scenarios (2024–2025), identical Claude Haiku agents — same model, same prompt, same scenarios — were given different toolkits. Agent A got OHLC bars and news headlines. Agent B got the same plus Chart Library’s cohort intelligence tools. Agent B won 50/50.

All six reasoning dimensions improved, each with a paired t-statistic greater than 10. Investigation quality: +2.75. Evidence use: +1.88. Reasoning rigor: +1.40. Probability of a 50–0 sweep under the null hypothesis: < 1 in 10¹⁵. Methodology, results JSON, judge rationales, and case studies are open at chartlibrary.io/evaluation and github.com/grahammccain/chart-library-adqe.
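The sweep probability can be checked by hand. The sketch below assumes the simplest reading of the null hypothesis, which the text does not spell out: each of the 50 pairings is an independent fair coin flip with no ties.

```python
from fractions import Fraction

# Assumed null: each scenario is an independent 50/50 coin flip, no ties.
N_SCENARIOS = 50

# Probability that one named agent wins every single pairing.
p_sweep = Fraction(1, 2) ** N_SCENARIOS

print(float(p_sweep))                   # about 8.9e-16
print(p_sweep < Fraction(1, 10**15))    # True: rarer than 1 in 10^15
```

Exact fractions are used so the comparison against 1 in 10¹⁵ does not depend on floating-point rounding.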

The takeaway: cohort intelligence is the right primitive for AI agents reasoning about markets. Anyone can replicate the test.

Why point forecasts fail

The point-prediction era of AI-for-stocks made three implicit claims, all of which turn out to be false:

“We can predict where a stock will go.” Markets are weakly predictable in distribution. They are not predictable in point. The variance of single-day returns dominates any expected-value signal a model can extract.
“More data and bigger models will eventually make point forecasts precise.” Better models reduce variance. They do not reduce the irreducible market uncertainty point forecasts ignore by construction.
“You should trust the prediction.” A model that returns “+2.3%” with no error bars is asking you to trust a number you can’t audit, generated by a model you can’t introspect, against a future you can’t verify until weeks later.

Cohort intelligence inverts each. It doesn’t predict — it retrieves. The 300-analog cohort is the answer; there’s no point estimate to be precise about. And every claim ties to a verifiable retrieval the user can independently audit.

Cohort intelligence vs the alternatives

Five common approaches to “what will this stock do next?” — and how cohort intelligence differs from each.

vs point-prediction stock forecasting

Point forecast: “NVDA +2.3%.” Cohort: “300 analogs, median +0.4%, p10/p90 -4.2%/+5.6%, win rate 54%, top winner-separating feature: low volatility regime (negative).” A point forecast hides uncertainty; a cohort surfaces it.

vs traditional technical analysis pattern recognition

TA pattern recognition labels charts (“bull flag,” “head and shoulders”) and applies generic rules. Cohort intelligence skips the label and goes straight to the empirical question: what did this specific shape do, in this specific market regime, in actual historical data?

vs LLM market commentary

An LLM asked “what happens after a chart like NVDA on 2024-08-05?” hallucinates plausibly. A cohort intelligence call grounds the LLM in 300 verifiable historical analogs and lets it reason from facts.

vs single-stock backtesting

Single-stock backtests have ~10 years of one symbol, maybe 2,500 trading days. Cohort intelligence draws from ~25 million cross-symbol patterns covering the same 10 years across the US-equity universe: roughly 10,000× the analog density (25 million patterns versus 2,500 trading days).

vs proprietary “AI stock pick” ranking models

Ranking models (K-score, AI grades, etc.) compress everything to an opaque scalar and hide the reasoning. Cohort intelligence externalizes the reasoning surface — the user (or the agent) can read the actual cohort, audit similarity, check regime context.

What a cohort intelligence response actually contains

Anchor a (symbol, date, timeframe). Say: NVDA on August 5 2024 at the 1-hour timeframe. The response has four components.

1. The cohort itself

The 300 historical chart patterns most similar in shape to NVDA on that date. Real patterns from real symbols on real historical dates — concrete things the user can audit. Could be PFE on 2019-03-12, RIO on 2022-08-08, AMD on 2017-04-14. Whatever was actually most similar in the embedding space.

2. The full forward-return distribution

What did those 300 analogs do over the next 1, 5, and 10 trading days? Median, mean (and trimmed mean, robust to outliers), p10 and p90, win rate. Not a single number — a distribution. For NVDA on 2024-08-05 at 1h, the actual 5-day distribution: median −1.3%, p10/p90 of −11.3%/+6.8%, win rate 44%.
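To make the statistics concrete, here is a minimal sketch of how such a summary could be computed from a list of analog forward returns. It is not Chart Library's implementation: the function name, the 10% trim, the "return above zero" definition of a win, and the default percentile method of Python's `statistics.quantiles` are all assumptions chosen for illustration.

```python
import statistics

def summarize_returns(returns, trim=0.1):
    """Summarize a cohort's forward returns (fractions, e.g. 0.02 = +2%)."""
    ordered = sorted(returns)
    n = len(ordered)
    k = int(n * trim)                      # observations dropped from each tail
    trimmed = ordered[k:n - k] if n > 2 * k else ordered
    deciles = statistics.quantiles(ordered, n=10)   # 9 cut points: p10 ... p90
    return {
        "n": n,
        "median": statistics.median(ordered),
        "mean": statistics.fmean(ordered),
        "trimmed_mean": statistics.fmean(trimmed),  # less sensitive to outliers
        "p10": deciles[0],
        "p90": deciles[-1],
        "win_rate": sum(r > 0 for r in ordered) / n,  # assumed: win = return > 0
    }
```

Reporting the trimmed mean next to the plain mean shows at a glance whether a handful of extreme analogs is driving the average.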

3. Feature attribution

Within the cohort, which features separated the winners from the losers? For our example anchor: tight credit spreads were a positive factor (analogs that occurred during tight credit outperformed), bullish macro state was positive, low vol regime was negative. This is conditioning information — the user can ask, does the live anchor share the positive features?
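The text does not say how attribution is computed. One simple way to picture it, offered purely as an assumption, is a win-rate lift: compare the win rate of analogs that have a feature value against the win rate of those that do not. The field name `ret_5d` and the function name are hypothetical.

```python
def _win_rate(returns):
    """Share of returns strictly above zero."""
    return sum(r > 0 for r in returns) / len(returns)

def win_rate_lift(analogs, feature, value):
    """Win rate where feature == value, minus win rate of the remaining analogs.

    `analogs` is a list of dicts holding a forward return under "ret_5d"
    (hypothetical field name) plus arbitrary feature keys.
    """
    with_value = [a["ret_5d"] for a in analogs if a.get(feature) == value]
    without = [a["ret_5d"] for a in analogs if a.get(feature) != value]
    if not with_value or not without:
        return None                      # nothing to compare against
    return _win_rate(with_value) - _win_rate(without)
```

A positive lift corresponds to a "positive factor" in the sense used above, a negative lift to a negative one. Real attribution would also need to account for sample size and for correlated features, which this sketch ignores.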

4. Regime stratification

How does the cohort split by current regime? In low-vol regimes this cohort had a 38% win rate; in high-vol, 51%. Same cohort, different stories depending on which regime we’re in today.
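Stratification amounts to grouping the cohort by a regime label and recomputing the statistics per group. A minimal sketch, assuming each analog carries a `vol_regime` label (the filter name appears in the FAQ) and a hypothetical `ret_5d` forward return:

```python
from collections import defaultdict

def win_rate_by_regime(analogs):
    """Map each regime label to (analog count, win rate) within the cohort."""
    buckets = defaultdict(list)
    for analog in analogs:
        buckets[analog["vol_regime"]].append(analog["ret_5d"])
    return {
        regime: (len(rets), sum(r > 0 for r in rets) / len(rets))
        for regime, rets in buckets.items()
    }
```

Returning the count alongside the win rate matters: a stratum with only a handful of analogs should be read with more caution than the full cohort.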

The four engineering disciplines

Cohort intelligence is conceptually simple: vector search + outcome lookup. The work is in keeping it methodology-honest.

1. Embeddings

A useful similarity function for chart patterns is the load-bearing piece. Hand-engineered features fail — too many degrees of freedom. We trained 256-dimensional self-supervised embeddings on minute-bar data: ~25M chart pattern embeddings spanning 10 years of history, drawn from a 19,000+ US-equity universe. Critically, we don’t condition the embedding on forward returns — that would be a leak.
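Once embeddings exist, retrieval is a nearest-neighbor search. The sketch below uses brute-force cosine similarity in pure Python. Both choices are assumptions: the text does not name the distance metric or the index, and a library of ~25M vectors would presumably need an approximate nearest-neighbor index rather than a full scan.

```python
import math

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def nearest_analogs(anchor_vec, library, k=300):
    """Return the k most similar entries as (similarity, symbol, date) tuples.

    `library` is a list of (symbol, date, vector) tuples; forward returns are
    deliberately absent, matching the no-leak rule described above.
    """
    scored = [(cosine(anchor_vec, vec), sym, day) for sym, day, vec in library]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:k]
```

Outcomes are looked up only after the cohort is fixed, so similarity never depends on what happened next.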

2. Cohort hygiene

When you retrieve nearest neighbors of NVDA · 2024-08-05 · 1h, several adjacent days for NVDA itself look very similar. Including those adjacent days produces a meaninglessly tight cohort that’s secretly almost-the-same-anchor. We exclude same-symbol matches within ±10 calendar days.
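The exclusion rule is easy to state in code. This sketch assumes candidates arrive as (symbol, date) pairs and treats the ±10-day window as inclusive; the function name and the inclusive boundary are illustrative assumptions.

```python
from datetime import date

def drop_near_duplicates(anchor_symbol, anchor_date, candidates, window_days=10):
    """Remove same-symbol candidates within +/- window_days calendar days."""
    kept = []
    for symbol, day in candidates:
        too_close = (
            symbol == anchor_symbol
            and abs((day - anchor_date).days) <= window_days
        )
        if not too_close:
            kept.append((symbol, day))
    return kept

candidates = [
    ("NVDA", date(2024, 8, 1)),    # 4 days before the anchor: dropped
    ("NVDA", date(2024, 7, 25)),   # 11 days before: kept
    ("AMD", date(2024, 8, 5)),     # different symbol: kept
]
cohort = drop_near_duplicates("NVDA", date(2024, 8, 5), candidates)
```

Other symbols on the same date stay in the cohort, since the rule targets only near-copies of the anchor itself.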

3. Calibration

Raw retrieval yields bands with a nominal coverage level (say 80%), but their empirical coverage on held-out anchors is usually off. Split conformal correction widens the bands by a margin learned on a held-out calibration set, so that actual coverage hits 80% on held-out anchors.
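As an illustration of the idea, here is a minimal split conformal sketch. It uses a CQR-style conformity score (how far the realized outcome falls outside the raw band), which is an assumption: the text says split conformal prediction is used but does not specify the score. Function names are hypothetical.

```python
import math

def conformal_margin(cal_bands, cal_outcomes, alpha=0.2):
    """Margin to add to each side of a raw band for (1 - alpha) coverage.

    cal_bands: raw (lo, hi) bands for held-out calibration anchors.
    cal_outcomes: the realized forward return for each of those anchors.
    """
    # Score > 0 means the outcome fell outside the raw band by that amount.
    scores = sorted(
        max(lo - y, y - hi) for (lo, hi), y in zip(cal_bands, cal_outcomes)
    )
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))   # finite-sample corrected rank
    return scores[k - 1] if k <= n else math.inf

def calibrate(band, margin):
    """Apply the margin: positive widens the band, negative narrows it."""
    lo, hi = band
    return (lo - margin, hi + margin)
```

The margin is computed once on calibration anchors and then applied to every new band. The coverage guarantee holds on average over anchors that are exchangeable with the calibration set, which is why held-out anchors are needed.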

4. Eval discipline

Symbol-disjoint splits (NVDA in train means no NVDA in test). 10-day embargo windows. Honest negatives published. And the paired-agent eval at chartlibrary.io/evaluation measures whether the layer actually improves agent reasoning, not just whether the embeddings recall similar charts.
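A symbol-disjoint split can be sketched by assigning each symbol, rather than each anchor, to a side. Hashing the symbol is one convenient way to do that deterministically; it is an assumption here, not a description of the actual pipeline, and the embargo window is left out of this sketch.

```python
import hashlib

def split_of(symbol, test_fraction=0.2):
    """Assign a symbol to 'train' or 'test' deterministically by hashing."""
    digest = hashlib.sha256(symbol.encode("utf-8")).digest()
    bucket = int.from_bytes(digest[:8], "big") / 2**64   # value in [0, 1)
    return "test" if bucket < test_fraction else "train"

def symbol_disjoint_split(anchors, test_fraction=0.2):
    """Split (symbol, date) anchors so no symbol appears on both sides."""
    train, test = [], []
    for symbol, day in anchors:
        side = test if split_of(symbol, test_fraction) == "test" else train
        side.append((symbol, day))
    return train, test
```

Because the assignment depends only on the symbol, every anchor for a given ticker lands on the same side no matter when the split is recomputed.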

When to use cohort intelligence

Cohort intelligence is the right primitive when the question has this shape:

An AI agent needs to reason about a specific (symbol, date) anchor. Trading agents, research agents, alert-summary agents. The cohort response gives the LLM facts to ground in.
You want a calibrated forward-return distribution, not a point estimate. Risk modeling, position sizing, scenario analysis.
You want to explain a chart pattern’s historical behavior. Investor letters, research notes, sales conversations grounded in “here’s what 300 historical analogs actually did.”
You’re building a screener that asks “what setups look interesting tomorrow?” Cohort-ranked discovery surfaces setups with cleanly defined historical analog density.
You need a regime-aware view of a known pattern. Bull flags performed one way in 2021, another way in 2022. Regime stratification tells you which.

It is not the right primitive when you need a strict trading signal (cohort intelligence informs decisions; it doesn’t make them), a fundamental valuation (different shape of data), or millisecond-latency order routing (cohort calls take ~280ms at the median).

Why this is the right primitive for AI agents

LLM-based trading and research agents need facts they can reason about. Cohort intelligence is fact-shaped:

cohort_size: 300 — the agent can reason about sample size
median_5d: -1.3%, win_rate: 0.44 — the agent can describe a distribution, not a guess
credit_spread_state=tight (positive) — the agent can check whether that factor is currently present and conditionally update
conformal coverage: 80% empirical — the agent can express calibrated uncertainty

Compare to handing an agent “+2.3% NVDA forecast.” The agent has nothing to reason about. And in the paired-agent evaluation, the cohort-equipped agent didn’t just produce different answers; it produced answers a senior portfolio manager would actually defend. That’s the bar.
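To show what "fact-shaped" buys an agent, here is a small sketch that turns the fields listed above into a quotable sentence and refuses to describe a distribution below the n=30 floor mentioned in the FAQ. The dict layout and the encoding of `median_5d` as a fraction (−0.013 for −1.3%) are assumptions; the real response schema may differ.

```python
MIN_COHORT = 30   # floor below which the FAQ says stats are too noisy

def describe_cohort(resp):
    """Turn a cohort response dict into a sentence an agent can quote."""
    n = resp["cohort_size"]
    if n < MIN_COHORT:
        return f"Only {n} analogs: too few to describe a distribution."
    return (
        f"{n} analogs, 5-day median {resp['median_5d']:+.1%}, "
        f"win rate {resp['win_rate']:.0%}."
    )

print(describe_cohort({"cohort_size": 300, "median_5d": -0.013, "win_rate": 0.44}))
# 300 analogs, 5-day median -1.3%, win rate 44%.
```

A bare point forecast offers no equivalent of the sample-size check: there is no n to inspect.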

Try it

Cohort intelligence is exposed via REST API and as an MCP server. The simplest entry point is the MCP tool. Install it once, then wire it into Claude, Cursor, or any MCP-aware agent:

pip install chartlibrary-mcp

From any MCP-aware agent:

# In Claude Desktop, Cursor, or a custom MCP client:
> What's the historical cohort for NVDA on 2024-08-05 at 1h?

# The agent calls cohort_analyze and returns:
# - 300 historical analogs
# - 5d median return, p10/p90, win rate
# - top 3 features that separated winners from losers
# - regime stratification

Or as a direct REST call. The full cohort intelligence endpoints (/api/v1/cohort_analyze, narrative_pulse, cohort_compare) are part of the Builder tier ($29/mo). Free Sandbox tier covers text search, follow-through, and the public discovery surfaces. Grab a key at chartlibrary.io/developers:

curl -X POST https://chartlibrary.io/api/v1/cohort_analyze \
  -H "Authorization: Bearer cl_..." \
  -H "Content-Type: application/json" \
  -d '{
    "anchor": {"symbol": "NVDA", "date": "2024-08-05", "timeframe": "1h"},
    "cohort_size": 300,
    "horizons": [1, 5, 10]
  }'
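The same request can be built from Python with only the standard library. This sketch mirrors the curl call above (same URL, headers, and JSON body); the function name is hypothetical, and nothing is assumed about the shape of the response, so sending and parsing are left to the caller.

```python
import json
import urllib.request

API_URL = "https://chartlibrary.io/api/v1/cohort_analyze"

def build_cohort_request(api_key, symbol, date, timeframe,
                         cohort_size=300, horizons=(1, 5, 10)):
    """Build (but do not send) the POST request shown in the curl example."""
    payload = {
        "anchor": {"symbol": symbol, "date": date, "timeframe": timeframe},
        "cohort_size": cohort_size,
        "horizons": list(horizons),
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# Sending requires a valid key, e.g.:
#   req = build_cohort_request(key, "NVDA", "2024-08-05", "1h")
#   with urllib.request.urlopen(req) as resp:
#       result = json.load(resp)
```

Separating request construction from sending keeps the payload easy to inspect and to unit-test without network access.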

What cohort intelligence is not

Not a trading signal. A 60% win-rate cohort doesn’t mean “buy this stock.” That’s information; what you do with it is your decision.
Not a guarantee. Historical pattern similarity is a strong prior, not a forecast. Regime shifts can break the prior.
Not a replacement for fundamental analysis. It’s a complement, not a substitute.
Not a black box. Every claim in a cohort response ties to a verifiable retrieval — the user can inspect the actual 300 analogs and check the math.

Frequently asked questions

What's the smallest useful cohort size?: We default to 300 historical analogs. Below n=30 the distribution stats are too noisy to be meaningful; the API surfaces a warning when a filtered cohort drops below that floor.
Can I filter the cohort by regime, sector, or news context?: Yes. cohort_analyze accepts filters for vol_regime, days_since_earnings, days_since_ath, sector, has_news, macro_state, relative_volume, and realized_vol. Filters narrow the cohort to comparable historical situations.
How fresh is the data?: Daily bars are ingested nightly; new pattern embeddings are computed and indexed for every trading day. Same-day intraday queries use the most recent close.
Does cohort intelligence work for crypto, forex, or commodities?: Currently US equities only: 19,000+ tickers, including delisted ones (so no survivorship bias). Crypto and global equities are a future expansion.
What's the latency on a cohort intelligence call?: ~280ms median for /api/v1/cohort_analyze with default cohort_size=300. The full Layer 3 response (cohort + outcome distribution + feature importance + regime stratification + risk profile) is computed and returned in one round trip.
How does cohort intelligence avoid look-ahead bias?: Each retrieval respects an as_of_date — analogs are filtered to dates strictly before the anchor, and outcome distributions are computed only from those analogs' realized forward returns at the time. Same-symbol matches within ±10 calendar days are excluded to prevent trivially-similar adjacent days from collapsing the cohort.
Can I commercially use cohort intelligence in my AI agent product?: Yes. Sandbox (free) for evaluation; Builder ($29/mo) unlocks cohort_analyze and the full Layer 3 endpoints for commercial agent workloads; Scale ($99/mo) for higher throughput; Agent ($299/mo) for high-volume orchestration; Enterprise (from $2K/mo) for funds and embedded use cases.
Has cohort intelligence been independently validated?: Yes. A blind LLM-judged paired evaluation across 50 out-of-sample scenarios (2024-2025) showed AI agents with Chart Library's cohort intelligence tools beat identical agents without them 50-0 in winner consensus, with paired t-statistic > 10 on all six reasoning dimensions. Full methodology and code are open at chartlibrary.io/evaluation and github.com/grahammccain/chart-library-adqe.