Forecasting — plain-language explainer
Forecasting turns an uncertain people decision into numbers you can act on: a distribution over the outcome, and a dollar value on what knowing more would be worth.
A People Analytics Toolbox component. Built to the portfolio Explainer Standard v1.0. Every claim below is grounded in the spoke's own code and contracts (src/spokes/forecasting/, contract 1.4.0); anything not yet built is marked (TBD).
1. What is it?
Forecasting is a set of callable decision-math services: give it a model of an uncertain outcome and it returns either a full sampled distribution (Monte Carlo simulation) or the expected value of resolving the uncertainty before you commit (value of information — EVPI and discrete EVSI).
It is deliberately math and persistence only — not a dashboard. A consumer app sends a spec over HTTP (or MCP), the spoke runs the calculation with a reproducible seed, and returns a typed result the consumer renders. Around that core sit four stateless PA Instruments — interval scoring, Bayesian precision-combine, a measurement-method catalog, and a measurement recommender — composable primitives that other Toolbox products assemble into larger decision flows.
Visual — Tier B (typographic capability map). What the spoke exposes:
- monte-carlo/run: sampled distribution over a closed-form model (mean, stdev, percentiles, raw draws).
- voi/compute: EVPI / EVSI on a discrete decision tree (what perfect or partial information is worth).
- decision-models: register / fetch a reusable decision tree.
- interval-scoring (PA Instrument): score predicted intervals against actuals (Winkler).
- bayesian-combine (PA Instrument): fuse several estimate sources into one posterior.
- measurement-recommend + measurement-catalog (PA Instruments): rank measurement methods by value of information.
2. What problem does it solve — and why is it different?
The pain it removes: people decisions — a promotion, a program investment, a comp move, a hiring-pace bet — are made under genuine uncertainty, but the analysis usually collapses to a single point estimate in a spreadsheet, with the uncertainty either hidden or argued about informally.
The difference, stated as a shift:
- FROM one hand-built spreadsheet number, with the uncertainty in someone's head and no record of how it was produced.
- TO a sampled distribution (or an explicit dollar value on resolving the uncertainty), produced from a versioned API contract, with a deterministic seed so the same spec always reproduces the same draws and an optional audit row in Postgres.
How it differs from the obvious substitutes:
- vs. spreadsheet Monte Carlo: those re-roll differently on every recalculation, can't be replayed, and live as untracked copies. Forecasting uses a seeded PRNG (Mulberry32), versioned request/response schemas, and writes auditable run rows (monte_carlo_runs, voi_analyses).
- vs. a full Bayesian / MCMC stack: intentionally narrow and fast, with closed-form expression evaluation over named parametric distributions for planning scenarios and exact discrete VOI on aligned decision trees. Explainable, not a black box.
- vs. generic BI: BI charts what happened; Forecasting prices what might happen and what it is worth to find out before deciding.
Visual — Tier B (FROM→TO shift block). The shift above is the visual; a rendered comparison page is a follow-up (FU-A).
3. How does it work?
Inputs → method → outputs, concretely, framed as the questions a practitioner asks.
"What's the range of outcomes?" — Monte Carlo.
- Input: a MonteCarloSpec with named variables, each a distribution (normal, lognormal, triangular, uniform, beta, or discrete-empirical), plus a closed-form expression over those variables, a trial count (1–100,000), and an optional randomSeed.
- Method: the spoke samples each variable per trial (Box–Muller for normals; gamma-ratio sampling for beta draws, aligned with the voi-calculator donor), evaluates the expression with a hand-rolled parser (+ - * / ^, parentheses, unary minus, identifiers; no eval), and collects the outcomes.
- Output: a MonteCarloResult with mean, stdev, requested percentiles, and the raw distribution array.
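The sampling loop described above can be sketched in a few dozen lines. This is a minimal illustration, not the spoke's implementation: the type and function names (Dist, runMonteCarloSketch, percentileOf) are hypothetical, only two of the six distribution shapes are covered, the model is passed as a plain function rather than a parsed expression string, and the percentile convention (linear interpolation) is an assumption.

```typescript
// Hypothetical shapes for illustration; the real MonteCarloSpec lives in the spoke's contract.
type Dist =
  | { kind: "normal"; mean: number; stdev: number }
  | { kind: "uniform"; min: number; max: number };

// Mulberry32: a small seeded 32-bit PRNG that returns floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), 1 | a);
    t = (t + Math.imul(t ^ (t >>> 7), 61 | t)) ^ t;
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

function sampleDist(d: Dist, rand: () => number): number {
  if (d.kind === "uniform") return d.min + (d.max - d.min) * rand();
  // Box–Muller transform; 1 - rand() keeps the log argument in (0, 1].
  const u1 = 1 - rand();
  const u2 = rand();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return d.mean + d.stdev * z;
}

// Percentile by linear interpolation between closest ranks (assumed convention).
function percentileOf(sorted: number[], p: number): number {
  const idx = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (idx - lo);
}

function runMonteCarloSketch(
  variables: Record<string, Dist>,
  model: (values: Record<string, number>) => number,
  trials: number,
  seed: number
) {
  const rand = mulberry32(seed);
  const draws: number[] = [];
  for (let i = 0; i < trials; i++) {
    const values: Record<string, number> = {};
    for (const key of Object.keys(variables)) {
      values[key] = sampleDist(variables[key], rand);
    }
    draws.push(model(values));
  }
  const mean = draws.reduce((s, x) => s + x, 0) / trials;
  const variance = draws.reduce((s, x) => s + (x - mean) * (x - mean), 0) / (trials - 1);
  const sorted = draws.slice().sort((x, y) => x - y);
  return {
    mean,
    stdev: Math.sqrt(variance),
    p10: percentileOf(sorted, 10),
    p50: percentileOf(sorted, 50),
    p90: percentileOf(sorted, 90),
    distribution: draws,
  };
}

// Hypothetical program-cost model: headcount * costPerHead.
const costSpec: Record<string, Dist> = {
  headcount: { kind: "normal", mean: 50, stdev: 5 },
  costPerHead: { kind: "uniform", min: 900, max: 1100 },
};
const costModel = (v: Record<string, number>) => v.headcount * v.costPerHead;
const runA = runMonteCarloSketch(costSpec, costModel, 10000, 42);
const runB = runMonteCarloSketch(costSpec, costModel, 10000, 42);
// runA and runB hold identical draws because both runs use the same seed.
```

The point of seeding is visible in the last two lines: replaying the same spec with the same seed reproduces every draw, which is what makes a run auditable.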
"Is it worth paying to know more before I decide?" — VOI.
- Input: a discrete DecisionNode tree (decisions over chance nodes sharing one sharedUncertaintyId), optionally with a DiscreteInformationSignal (per-observation likelihoods given each state).
- Method: the tree is flattened to a payoff matrix; baseline EV is the best action under the prior, perfect-information EV picks the best action per resolved state, and EVPI = perfectInformationEV − baselineEV (floored at 0). With a signal, discrete EVSI is computed Bayesian-style over posterior-weighted optimal actions.
- Output: a VOIResponse with baselineEV, perfectInformationEV, evpi, optional evsi, and a plain-language recommendation.
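Once the tree is flattened, the VOI method above reduces to a few lines of arithmetic on the payoff matrix. The sketch below assumes that flattened form is already available. The function names and the array layout (payoff[action][state], likelihood[state][observation]) are hypothetical, and the signal likelihoods in the usage lines are invented for illustration. The payoffs are the ones from the worked example at the end of this document.

```typescript
// Expected value of the best action under the given state weights.
// Weights may be unnormalised joint probabilities.
function bestActionValue(payoff: number[][], weights: number[]): number {
  return Math.max(
    ...payoff.map((row) => row.reduce((sum, x, s) => sum + x * weights[s], 0))
  );
}

function computeVoiSketch(payoff: number[][], prior: number[], likelihood?: number[][]) {
  // Baseline: commit to one action using only the prior.
  const baselineEV = bestActionValue(payoff, prior);

  // Perfect information: choose the best action separately in each state.
  const perfectInformationEV = prior.reduce(
    (sum, p, s) => sum + p * Math.max(...payoff.map((row) => row[s])),
    0
  );
  const evpi = Math.max(0, perfectInformationEV - baselineEV);

  let evsi: number | undefined;
  if (likelihood !== undefined) {
    const lik = likelihood;
    let preposteriorEV = 0;
    for (let o = 0; o < lik[0].length; o++) {
      // Joint weight P(state) * P(obs | state). Maximising over actions with
      // joint weights equals P(obs) times the best posterior EV, so summing
      // over observations gives the expected value of acting after the signal.
      const joint = prior.map((p, s) => p * lik[s][o]);
      preposteriorEV += bestActionValue(payoff, joint);
    }
    evsi = Math.max(0, preposteriorEV - baselineEV);
  }
  return { baselineEV, perfectInformationEV, evpi, evsi };
}

// Rows: promote, no_promotion. Columns: state A, state B.
const promotionPayoff = [
  [100, 0],
  [30, 80],
];
const promotionPrior = [0.5, 0.5];
const perfectOnly = computeVoiSketch(promotionPayoff, promotionPrior);
// perfectOnly.baselineEV = 55, perfectOnly.perfectInformationEV = 90, perfectOnly.evpi = 35

// Hypothetical 80%-accurate signal: P(obs | state), rows are states.
const assessmentSignal = [
  [0.8, 0.2],
  [0.2, 0.8],
];
const withSignal = computeVoiSketch(promotionPayoff, promotionPrior, assessmentSignal);
// withSignal.evsi is about 20: an imperfect signal is worth less than the EVPI of 35.
```

EVSI can never exceed EVPI, which is why EVPI works as a ceiling on any measurement budget.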
Science backing. The VOI math is standard decision analysis (EVPI / EVSI). The four PA Instruments are lifted from Douglas Hubbard's How to Measure Anything: interval calibration (Winkler interval score: interval width plus an out-of-bounds penalty scaled by 2/α); precision-weighted Bayesian combination (precision = 1/(SE² + ε), posterior SE = √(1/Σ effective-weight), 90% CI via z = 1.645); a ten-method measurement catalog whose rows each carry a cost range, an uncertainty-reduction range, and a How to Measure Anything chapter reference (htmaReference in the contract); and the measurement recommender that ranks those methods.
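The two formulas quoted above are short enough to write out. This is a sketch under stated assumptions, not the spoke's code: the names (winklerScore, combineEstimates, SourceEstimate) are hypothetical, the ε default of 1e-9 is a placeholder, and each source's effective weight is taken to be its precision with no further adjustment.

```typescript
// Winkler interval score for a (1 - alpha) prediction interval [lower, upper].
// Lower is better: the score is the interval width, plus a penalty of
// (2 / alpha) times the miss distance when the actual falls outside.
function winklerScore(lower: number, upper: number, actual: number, alpha: number): number {
  const width = upper - lower;
  if (actual < lower) return width + (2 / alpha) * (lower - actual);
  if (actual > upper) return width + (2 / alpha) * (actual - upper);
  return width;
}

interface SourceEstimate {
  label: string;
  mean: number;
  se: number; // standard error of this source's estimate
}

// Precision-weighted combination. epsilon guards against division by zero
// when a source reports se = 0 (the value used here is an assumption).
function combineEstimates(sources: SourceEstimate[], epsilon = 1e-9) {
  const precisions = sources.map((s) => 1 / (s.se * s.se + epsilon));
  const totalPrecision = precisions.reduce((sum, w) => sum + w, 0);
  const mean = sources.reduce((sum, s, i) => sum + s.mean * precisions[i], 0) / totalPrecision;
  const se = Math.sqrt(1 / totalPrecision);
  const z90 = 1.645; // two-sided 90% interval
  return {
    mean,
    se,
    ci90: [mean - z90 * se, mean + z90 * se] as [number, number],
    // Each source's share of the total precision weight.
    shares: precisions.map((w) => w / totalPrecision),
  };
}

// A 90% interval of [40, 60] scored against two hypothetical outcomes.
const hitScore = winklerScore(40, 60, 50, 0.1); // inside the interval: width only, 20
const missScore = winklerScore(40, 60, 65, 0.1); // misses by 5: penalised on top of the width

// Two hypothetical sources that disagree; the tighter one dominates.
const fused = combineEstimates([
  { label: "research estimate", mean: 100, se: 10 },
  { label: "executive estimate", mean: 120, se: 20 },
]);
// fused.mean is about 104 and fused.shares is about [0.8, 0.2]
```

Halving a source's standard error quadruples its weight, so a single well-measured estimate can outweigh several loose ones.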
Differentiation beat: the practitioner's real question is rarely "what's the average" — it's "should I spend money/time to reduce this uncertainty, or just decide?" VOI answers that with a number (EVPI), and the measurement recommender turns it into an action: it ranks methods by expectedNetVoi = variableEvpi·(typicalReduction/100) − typicalCost and buckets each as worthwhile / marginal / not_worthwhile.
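The ranking formula above is simple enough to show directly. In this sketch the method rows are invented placeholders, not entries from the shipped catalog, and the names (CatalogMethodSketch, rankByNetVoi) are hypothetical. The thresholds that separate worthwhile, marginal, and not_worthwhile are not stated in this document, so the sketch stops at ranking.

```typescript
interface CatalogMethodSketch {
  id: string;
  typicalCost: number; // same currency unit as the EVPI
  typicalReduction: number; // expected uncertainty reduction, in percent
}

// expectedNetVoi = variableEvpi * (typicalReduction / 100) - typicalCost,
// sorted so the most valuable method comes first.
function rankByNetVoi(variableEvpi: number, methods: CatalogMethodSketch[]) {
  return methods
    .map((m) => ({
      id: m.id,
      expectedNetVoi: variableEvpi * (m.typicalReduction / 100) - m.typicalCost,
    }))
    .sort((x, y) => y.expectedNetVoi - x.expectedNetVoi);
}

// Placeholder rows for illustration only.
const placeholderMethods: CatalogMethodSketch[] = [
  { id: "survey", typicalCost: 5000, typicalReduction: 40 },
  { id: "expert-elicitation", typicalCost: 1000, typicalReduction: 20 },
  { id: "experiment", typicalCost: 20000, typicalReduction: 80 },
];

// With a large EVPI the expensive, high-reduction method ranks first.
const rankedLarge = rankByNetVoi(50000, placeholderMethods);
// Order: experiment, survey, expert-elicitation

// With a tiny EVPI every method's net value is negative.
const rankedSmall = rankByNetVoi(35, placeholderMethods);
```

The same catalog gives opposite advice at different EVPI values, which is why the recommender takes the variable's EVPI as input rather than ranking methods in the abstract.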
Visual — Tier B (step flow). MonteCarloSpec → seeded sampling → expression eval → { mean · stdev · percentiles · draws }, and in parallel DecisionTree (+ signal) → payoff matrix → { baselineEV · EVPI · EVSI · recommendation }.
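The expression-eval step in the flow above can be done without eval by a small recursive-descent parser. This is an illustrative sketch, not the spoke's parser: the function name evaluateExpression is hypothetical, and two conventions are assumptions, namely that ^ is right-associative and that unary minus binds more loosely than ^ (so -2 ^ 2 is -4).

```typescript
// Evaluates + - * / ^, parentheses, unary minus, numbers and identifiers.
function evaluateExpression(expression: string, vars: Record<string, number>): number {
  const tokens = expression.match(/\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*\/^()]/g) ?? [];
  // Reject any character the tokenizer skipped (for example "$" or ";").
  if (tokens.join("") !== expression.replace(/\s+/g, "")) {
    throw new Error("invalid character in expression");
  }
  let pos = 0;
  const peek = (): string | undefined => tokens[pos];
  const next = (): string | undefined => tokens[pos++];

  // expr := term (('+' | '-') term)*
  function parseExpr(): number {
    let value = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      const rhs = parseTerm();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }
  // term := unary (('*' | '/') unary)*
  function parseTerm(): number {
    let value = parseUnary();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      const rhs = parseUnary();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }
  // unary := '-' unary | power
  function parseUnary(): number {
    if (peek() === "-") {
      next();
      return -parseUnary();
    }
    return parsePower();
  }
  // power := primary ('^' unary)?   (right-associative)
  function parsePower(): number {
    const base = parsePrimary();
    if (peek() === "^") {
      next();
      return Math.pow(base, parseUnary());
    }
    return base;
  }
  // primary := number | identifier | '(' expr ')'
  function parsePrimary(): number {
    const tok = next();
    if (tok === undefined) throw new Error("unexpected end of expression");
    if (tok === "(") {
      const value = parseExpr();
      if (next() !== ")") throw new Error("expected ')'");
      return value;
    }
    if (/^\d/.test(tok)) return parseFloat(tok);
    if (/^[A-Za-z_]/.test(tok)) {
      if (!Object.prototype.hasOwnProperty.call(vars, tok)) {
        throw new Error("unknown variable: " + tok);
      }
      return vars[tok];
    }
    throw new Error("unexpected token: " + tok);
  }

  const value = parseExpr();
  if (pos !== tokens.length) throw new Error("unexpected token: " + tokens[pos]);
  return value;
}

// Hypothetical variable names for one sampled trial.
const trialOutcome = evaluateExpression("revenue * (1 + growth) - cost", {
  revenue: 100,
  growth: 0.5,
  cost: 30,
}); // 120
```

Because identifiers can only resolve to the supplied variables, a caller-provided expression cannot reach anything outside the sampled values, which is the safety property that ruling out eval buys.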
4. What does it enable?
Concrete uses a practitioner would recognize:
- Price a promotion or staffing decision: model the candidate-state uncertainty and get EVPI, the most you should rationally spend to resolve it before committing (the seeded example below).
- Range a budget or comp scenario: Monte Carlo a salary-increase or program-cost model and read the P10/P50/P90 band instead of a single guess.
- Decide whether to run the study: feed a variable's EVPI to measurement-recommend and get a ranked, ROI-bucketed list of measurement methods (survey, experiment, benchmark, expert elicitation, …).
- Combine disagreeing estimates: fuse an executive's number, a research estimate, and a historical figure into one posterior with honest error bars (bayesian-combine), with each source's share of the precision weight reported.
- Score forecaster calibration: run predicted intervals against realized outcomes (interval-scoring) to measure coverage and sharpness; this feeds the Leadership Index's Predictive Acuity.
- Reuse decision models: register a decision tree once and re-run VOI against it as priors change.
Visual — (TBD — a rendered Monte Carlo distribution with P10/P50/P90 band for one model).
5. How it fits in the toolbox
Data flow:
- Consumes: nothing mandatory from other spokes; the model is supplied by the caller (a decision tree, a distribution spec, a set of estimate sources). It deliberately takes no forced cross-spoke import.
- Emits: typed results (MonteCarloResult, VOIResponse, IntervalScoreResult, BayesianCombineResult, MeasurementRecommendationResult). Consumers vendor src/spokes/forecasting/contracts/types.ts.
- Feeds: the PA Products that compose Instruments, namely the AnyComp decision layer (simulator / scenarios), the Leadership Index (interval-scoring → Predictive Acuity), and the Analytics-Plan Generator (VOI-ranked measurement plan). A Monte Carlo summary can also be wrapped in calculus's MetricEnvelope or used directly by decision-wizard and vela.
- Persistence boundary: write routes require a tenantId; persisted runs and analyses are tenant-scoped. The stateless PA Instruments take no tenant boundary.
Visual — Tier B (typographic data-flow). caller model → Forecasting (run / VOI / instruments) → { AnyComp simulator · Leadership Index acuity · Analytics-Plan · decision-wizard / vela }.
6. Commercialization / packaging
Forecasting is a service component, not a standalone product. Its endpoints are the PA Instruments layer — composable primitives that buyer-facing Products (AnyComp's decision layer, the Leadership Index, the Analytics-Plan Generator) assemble into something a buyer actually meets.
- Data-license posture: the spoke ships no third-party survey data — it computes over models the caller supplies, and the measurement-catalog rows are the spoke's own structured encoding of a public methodology (How to Measure Anything, cited per row). No vendor-data licensing constraint attaches to the math itself.
- Anything about pricing tiers or packaged offerings is (TBD) — not earned yet, so not stated.
Visual — Tier B (placement line). Instruments (this spoke) → composed by → Products (AnyComp · Leadership Index · Analytics-Plan) → buyer surface.
7. The vision
Decision math as plumbing: any people decision in the portfolio can ask "what's the range, and what is it worth to know more?" and get a reproducible, auditable answer — never a lone spreadsheet number again.
The direction is breadth of composition, not a new UI: more Products wiring the VOI loop and the calibration loop into their flows, and the measurement recommender closing the elicit→measure→re-score cycle. The named planned consumers in the registry are decision-wizard and vela. Anything beyond that is (TBD).
Visual — (TBD — the elicit → measure → re-score calibration loop as a cycle diagram).
8. Current status
Grounded in the real code state (contract 1.4.0, status: "live" in src/lib/contracts/registry.ts, src/spokes/forecasting/):
- Shipped (live): seeded Monte Carlo over six distribution shapes with a closed-form expression parser; discrete decision-tree VOI (EVPI + optional EVSI); register/fetch decision models; and the four PA Instruments (interval scoring, Bayesian precision-combine, the ten-method measurement catalog, and the VOI-ranked measurement recommender). Live routes: POST /monte-carlo/run, POST /decision-models, GET /decision-models/[id], POST /voi/compute, POST /interval-scoring, POST /bayesian-combine, POST /measurement-recommend, GET /measurement-catalog, GET /health. Eight contract IDs in the registry; MCP module registered; demo seed shipped (drizzle/0016_pat17_forecasting_seed.sql).
- In flight / planned: consumer wiring. decision-wizard and vela are listed planned in the registry, not yet live consumers. (The README's "1.1.0" line is stale; the contract source of truth is 1.4.0.)
Visual — Tier A (live capture). GET /api/spokes/forecasting/measurement-catalog returns the real ten-method catalog at request time; GET /api/spokes/forecasting/health reports live status.
Worked example (real, computed from the shipped seed)
The shipped seed pat17-seed-promotion-decision (in drizzle/0016_pat17_forecasting_seed.sql) encodes a real decision tree: promote vs no_promotion, over one shared candidate-state uncertainty.
POST /api/spokes/forecasting/voi/compute
{
  "tenantId": "demo",
  "decisionModelId": "pat17-seed-promotion-decision"
}
The tree's payoffs, by candidate state (each state at probability 0.5):
- promote → state A: 100, state B: 0
- no_promotion → state A: 30, state B: 80
The VOI core (computeVoi in src/spokes/forecasting/core/voi.ts) computes:
- baselineEV — best action under the prior: promote = 0.5·100 + 0.5·0 = 50; no_promotion = 0.5·30 + 0.5·80 = 55 → baseline = 55.
- perfectInformationEV — best action per resolved state: state A best = max(100, 30) = 100; state B best = max(0, 80) = 80 → 0.5·100 + 0.5·80 = 90.
- evpi = 90 − 55 = 35.
So the response is:
→ {
  "analysis": {
    "baselineEV": 55,
    "perfectInformationEV": 90,
    "evpi": 35,
    "recommendation": "Resolving the modeled uncertainty has strictly positive expected value (EVPI > 0)."
  }
}
What a practitioner does with it: EVPI = 35 is the ceiling on what it's worth to resolve the candidate-state uncertainty before deciding. Hand that 35 to measurement-recommend and it ranks methods by expected net VOI. At this EVPI every catalog method's expectedNetVoi (35·typicalReduction/100 − typicalCost) is far negative, so all of them bucket as not_worthwhile. The model is telling you that resolving the uncertainty is worth at most 35 and that no formal study clears its own cost at that scale, so just decide (or remodel at a realistic payoff scale). The seed file documents this same EVPI = 35; the figure above is recomputed from the tree, not invented.