Forecasting — plain-language explainer

Forecasting turns an uncertain people decision into numbers you can act on: a distribution over the outcome, and a dollar value on what knowing more would be worth.

A People Analytics Toolbox component. Built to the portfolio Explainer Standard v1.0. Every claim below is grounded in the spoke's own code and contracts (src/spokes/forecasting/, contract 1.4.0); anything not yet built is marked (TBD).


1. What is it?

Forecasting is a set of callable decision-math services: give it a model of an uncertain outcome and it returns either a full sampled distribution (Monte Carlo simulation) or the expected value of resolving the uncertainty before you commit (value of information — EVPI and discrete EVSI).

It is deliberately math and persistence only — not a dashboard. A consumer app sends a spec over HTTP (or MCP), the spoke runs the calculation with a reproducible seed, and returns a typed result the consumer renders. Around that core sit four stateless PA Instruments — interval scoring, Bayesian precision-combine, a measurement-method catalog, and a measurement recommender — composable primitives that other Toolbox products assemble into larger decision flows.

Visual — Tier B (typographic capability map). What the spoke exposes:

  • monte-carlo/run — sampled distribution over a closed-form model (mean, stdev, percentiles, raw draws).
  • voi/compute — EVPI / EVSI on a discrete decision tree (what perfect or partial information is worth).
  • decision-models — register / fetch a reusable decision tree.
  • interval-scoring (PA Instrument) — score predicted intervals against actuals (Winkler).
  • bayesian-combine (PA Instrument) — fuse several estimate sources into one posterior.
  • measurement-recommend + measurement-catalog (PA Instruments) — rank measurement methods by value of information.

2. What problem does it solve — and why is it different?

The pain it removes: people decisions — a promotion, a program investment, a comp move, a hiring-pace bet — are made under genuine uncertainty, but the analysis usually collapses to a single point estimate in a spreadsheet, with the uncertainty either hidden or argued about informally.

The difference, stated as a shift:

  • FROM one hand-built spreadsheet number, with the uncertainty in someone's head and no record of how it was produced.
  • TO a sampled distribution (or an explicit dollar value on resolving the uncertainty), produced from a versioned API contract, with a deterministic seed so the same spec always reproduces the same draws and an optional audit row in Postgres.

How it differs from the obvious substitutes:

  • vs. spreadsheet Monte Carlo — those re-roll differently every recalculation, can't be replayed, and live as untracked copies. Forecasting uses a seeded PRNG (Mulberry32), versioned request/response schemas, and writes auditable run rows (monte_carlo_runs, voi_analyses).
  • vs. a full Bayesian / MCMC stack — intentionally narrow and fast: closed-form expression evaluation over named parametric distributions for planning scenarios, and exact discrete VOI on aligned decision trees. Explainable, not a black box.
  • vs. generic BI — BI charts what happened; Forecasting prices what might happen and what it is worth to find out before deciding.

Visual — Tier B (FROM→TO shift block). The shift above is the visual; a rendered comparison page is a follow-up (FU-A).

3. How does it work?

Inputs → method → outputs, concretely, framed as the questions a practitioner asks.

"What's the range of outcomes?" — Monte Carlo.

  • Input: a MonteCarloSpec — named variables, each a distribution (normal, lognormal, triangular, uniform, beta, or discrete-empirical), plus a closed-form expression over those variables, a trial count (1–100,000), and an optional randomSeed.
  • Method: the spoke samples each variable per trial (Box–Muller for normals; gamma-ratio sampling for beta draws, aligned with the voi-calculator donor), evaluates the expression with a hand-rolled parser (+ - * / ^, parentheses, unary minus, identifiers — no eval), and collects the outcomes.
  • Output: a MonteCarloResult: mean, stdev, requested percentiles, and the raw distribution array.
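The sampling loop above can be sketched in a few lines. This is a minimal illustration, not the spoke's code: the function names, the two-variable model (headcount × salary), and the linear-interpolation percentile rule are all assumptions, and the model is passed as plain arithmetic rather than through the expression parser.

```typescript
// Mulberry32: a small seeded PRNG returning floats in [0, 1).
function mulberry32(seed: number): () => number {
  let a = seed | 0;
  return () => {
    a = (a + 0x6d2b79f5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Box-Muller transform: two uniform draws become one normal draw.
function sampleNormal(rand: () => number, mean: number, stdev: number): number {
  const u1 = 1 - rand(); // shift to (0, 1] so Math.log never sees 0
  const u2 = rand();
  const z = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  return mean + stdev * z;
}

// Percentile by linear interpolation over a sorted array (assumed rule).
function percentile(sorted: number[], p: number): number {
  const idx = (p / 100) * (sorted.length - 1);
  const lo = Math.floor(idx);
  const hi = Math.ceil(idx);
  return sorted[lo] + (sorted[hi] - sorted[lo]) * (idx - lo);
}

interface SketchResult {
  mean: number;
  stdev: number;
  percentiles: Record<number, number>;
  distribution: number[];
}

// Hypothetical model: cost = headcount * salary,
// headcount ~ uniform(8, 12), salary ~ normal(100, 10).
function runMonteCarloSketch(trials: number, seed: number, wanted: number[]): SketchResult {
  const rand = mulberry32(seed);
  const draws: number[] = [];
  for (let i = 0; i < trials; i++) {
    const headcount = 8 + 4 * rand();
    const salary = sampleNormal(rand, 100, 10);
    draws.push(headcount * salary);
  }
  const mean = draws.reduce((s, x) => s + x, 0) / trials;
  const variance = draws.reduce((s, x) => s + (x - mean) * (x - mean), 0) / (trials - 1);
  const sorted = draws.slice().sort((x, y) => x - y);
  const percentiles: Record<number, number> = {};
  for (const p of wanted) percentiles[p] = percentile(sorted, p);
  return { mean, stdev: Math.sqrt(variance), percentiles, distribution: draws };
}
```

Calling runMonteCarloSketch(20000, 42, [10, 50, 90]) twice yields identical draw arrays, which is the replay property a seeded PRNG buys over a spreadsheet that re-rolls on every recalculation.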

"Is it worth paying to know more before I decide?" — VOI.

  • Input: a discrete DecisionNode tree (decisions over chance nodes sharing one sharedUncertaintyId), optionally with a DiscreteInformationSignal (per-observation likelihoods given each state).
  • Method: the tree is flattened to a payoff matrix; baseline EV is the best action under the prior, perfect-information EV picks the best action per resolved state, and EVPI = perfectInformationEV − baselineEV (floored at 0). With a signal, discrete EVSI is computed Bayesian-style over posterior-weighted optimal actions.
  • Output: a VOIResponse: baselineEV, perfectInformationEV, evpi, optional evsi, and a plain-language recommendation.
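The EVPI and EVSI arithmetic described above can be sketched on an already-flattened payoff matrix. The function names and the array layout (payoffs[action][state], likelihood[state][observation]) are assumptions made for illustration; the shipped computeVoi works from a DecisionNode tree and may differ in detail.

```typescript
// Expected value of the best action under a given state distribution.
function bestExpectedValue(payoffs: number[][], probs: number[]): number {
  let best = -Infinity;
  for (const row of payoffs) {
    const ev = row.reduce((sum, payoff, s) => sum + payoff * probs[s], 0);
    if (ev > best) best = ev;
  }
  return best;
}

interface VoiSketch {
  baselineEV: number;
  perfectInformationEV: number;
  evpi: number;
  evsi?: number;
}

// payoffs[action][state], prior[state], likelihood[state][obs] = P(obs | state).
function computeVoiSketch(payoffs: number[][], prior: number[], likelihood?: number[][]): VoiSketch {
  const baselineEV = bestExpectedValue(payoffs, prior);

  // Perfect information: pick the best action separately in each state.
  let perfectInformationEV = 0;
  for (let s = 0; s < prior.length; s++) {
    let bestInState = -Infinity;
    for (const row of payoffs) bestInState = Math.max(bestInState, row[s]);
    perfectInformationEV += prior[s] * bestInState;
  }
  const result: VoiSketch = {
    baselineEV,
    perfectInformationEV,
    evpi: Math.max(0, perfectInformationEV - baselineEV), // floored at 0
  };

  if (likelihood) {
    // Sample information: for each observation, update to the posterior,
    // take the best action there, and weight by P(observation).
    let evWithSignal = 0;
    const observationCount = likelihood[0].length;
    for (let o = 0; o < observationCount; o++) {
      const joint = prior.map((p, s) => p * likelihood[s][o]);
      const pObs = joint.reduce((a, b) => a + b, 0);
      if (pObs === 0) continue; // observation cannot occur
      const posterior = joint.map((j) => j / pObs);
      evWithSignal += pObs * bestExpectedValue(payoffs, posterior);
    }
    result.evsi = Math.max(0, evWithSignal - baselineEV);
  }
  return result;
}
```

With payoffs [[100, 0], [30, 80]] and a 0.5 / 0.5 prior, this returns baselineEV 55, perfectInformationEV 90, and evpi 35. A signal that is right 80% of the time in either state gives an evsi of about 20, and a signal that carries no information gives 0.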

Science backing. The VOI math is standard decision analysis (EVPI / EVSI). The four PA Instruments are lifted from Douglas Hubbard's How to Measure Anything: interval calibration (Winkler interval score: interval width plus an out-of-bounds penalty scaled by 2/α), precision-weighted Bayesian combination (precision = 1/(SE²+ε), posterior SE = √(1/Σ effective-weight), 90% CI via z = 1.645), and a ten-method measurement catalog in which each method carries a cost range, an uncertainty-reduction range, and a How to Measure Anything chapter reference (htmaReference in the contract).
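The two formulas named above can be sketched as follows. The function and field names are hypothetical, the sketch assumes each source's effective weight is simply its raw precision, and the posterior mean uses standard inverse-variance weighting, which the text above does not spell out.

```typescript
// Winkler interval score for a (1 - alpha) prediction interval:
// interval width, plus 2/alpha times the miss distance when the actual
// value falls outside the interval. Lower is better.
function winklerScore(lower: number, upper: number, actual: number, alpha: number): number {
  const width = upper - lower;
  if (actual < lower) return width + (2 / alpha) * (lower - actual);
  if (actual > upper) return width + (2 / alpha) * (actual - upper);
  return width;
}

interface EstimateSource {
  mean: number;
  standardError: number;
}

interface CombinedEstimate {
  mean: number;
  standardError: number;
  ci90: [number, number];
  weights: number[]; // each source's share of total precision
}

// Precision-weighted combination. Assumption: effective weight = precision.
function combineEstimatesSketch(sources: EstimateSource[], epsilon = 1e-9): CombinedEstimate {
  const precisions = sources.map((s) => 1 / (s.standardError * s.standardError + epsilon));
  const total = precisions.reduce((a, b) => a + b, 0);
  const mean = sources.reduce((sum, s, i) => sum + precisions[i] * s.mean, 0) / total;
  const standardError = Math.sqrt(1 / total);
  const z = 1.645; // two-sided 90% interval
  return {
    mean,
    standardError,
    ci90: [mean - z * standardError, mean + z * standardError],
    weights: precisions.map((p) => p / total),
  };
}
```

For a 90% interval of [10, 20] (α = 0.1), an actual of 15 scores 10 and an actual of 22 scores 50, so a narrow interval that misses is punished far more than a wide one that covers. Combining two sources with equal standard errors splits the weight evenly and lands on the midpoint of their means.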

Differentiation beat: the practitioner's real question is rarely "what's the average" — it's "should I spend money/time to reduce this uncertainty, or just decide?" VOI answers that with a number (EVPI), and the measurement recommender turns it into an action: it ranks methods by expectedNetVoi = variableEvpi·(typicalReduction/100) − typicalCost and buckets each as worthwhile / marginal / not_worthwhile.
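A rough sketch of that ranking step follows. Only the expectedNetVoi formula and the three bucket names come from the text above; the three catalog rows, their costs and reductions, and the bucket thresholds are hypothetical stand-ins, since the real thresholds and the ten-method catalog live in the spoke.

```typescript
type Bucket = "worthwhile" | "marginal" | "not_worthwhile";

interface MethodSketch {
  id: string;
  typicalCost: number;      // in the same units as EVPI
  typicalReduction: number; // percent of uncertainty removed
}

interface RankedMethod extends MethodSketch {
  expectedNetVoi: number;
  bucket: Bucket;
}

// Hypothetical rows for illustration only; not the shipped catalog values.
const catalogSketch: MethodSketch[] = [
  { id: "expert_elicitation", typicalCost: 2000, typicalReduction: 25 },
  { id: "survey", typicalCost: 5000, typicalReduction: 40 },
  { id: "experiment", typicalCost: 30000, typicalReduction: 35 },
];

function recommendSketch(variableEvpi: number, methods: MethodSketch[]): RankedMethod[] {
  return methods
    .map((m): RankedMethod => {
      const expectedNetVoi = variableEvpi * (m.typicalReduction / 100) - m.typicalCost;
      // Assumed thresholds: net value at least equal to cost is worthwhile,
      // any other non-negative net value is marginal.
      const bucket: Bucket =
        expectedNetVoi >= m.typicalCost ? "worthwhile"
        : expectedNetVoi >= 0 ? "marginal"
        : "not_worthwhile";
      return { ...m, expectedNetVoi, bucket };
    })
    .sort((a, b) => b.expectedNetVoi - a.expectedNetVoi);
}
```

Under these made-up rows, an EVPI of 100,000 ranks survey first, then expert_elicitation, then experiment (which lands in marginal), while an EVPI of 35 pushes every method into not_worthwhile because no cost is recovered.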

Visual — Tier B (step flow). MonteCarloSpec → seeded sampling → expression eval → { mean · stdev · percentiles · draws }, and in parallel DecisionTree (+ signal) → payoff matrix → { baselineEV · EVPI · EVSI · recommendation }.
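The expression-eval step in that flow can be illustrated with a small recursive-descent evaluator covering the same operator set (+ - * / ^, parentheses, unary minus, identifiers) without calling eval. This is a sketch, not the shipped parser: the grammar, the precedence of unary minus relative to ^, and the error messages are assumptions.

```typescript
// Split the source into numbers, identifiers, and operators; reject anything else.
function tokenize(src: string): string[] {
  const tokens: string[] = src.match(/\d+(?:\.\d+)?|[A-Za-z_]\w*|[-+*\/^()]/g) || [];
  if (tokens.join("") !== src.replace(/\s+/g, "")) {
    throw new Error("unexpected character in expression");
  }
  return tokens;
}

function evaluateExpression(src: string, vars: Record<string, number>): number {
  const tokens = tokenize(src);
  let pos = 0;
  const peek = (): string | undefined => tokens[pos];
  const next = (): string | undefined => tokens[pos++];

  // expr := term (('+' | '-') term)*
  function parseExpr(): number {
    let value = parseTerm();
    while (peek() === "+" || peek() === "-") {
      const op = next();
      const rhs = parseTerm();
      value = op === "+" ? value + rhs : value - rhs;
    }
    return value;
  }

  // term := unary (('*' | '/') unary)*
  function parseTerm(): number {
    let value = parseUnary();
    while (peek() === "*" || peek() === "/") {
      const op = next();
      const rhs = parseUnary();
      value = op === "*" ? value * rhs : value / rhs;
    }
    return value;
  }

  // unary := '-' unary | power   (so -2^2 evaluates to -4)
  function parseUnary(): number {
    if (peek() === "-") {
      next();
      return -parseUnary();
    }
    return parsePower();
  }

  // power := primary ('^' unary)?   (right-associative)
  function parsePower(): number {
    const base = parsePrimary();
    if (peek() === "^") {
      next();
      return Math.pow(base, parseUnary());
    }
    return base;
  }

  // primary := number | identifier | '(' expr ')'
  function parsePrimary(): number {
    const tok = next();
    if (tok === undefined) throw new Error("unexpected end of expression");
    if (tok === "(") {
      const value = parseExpr();
      if (next() !== ")") throw new Error("expected ')'");
      return value;
    }
    if (/^\d/.test(tok)) return parseFloat(tok);
    if (/^[A-Za-z_]/.test(tok)) {
      // Own-property check so names like "toString" are not resolved by accident.
      if (!Object.prototype.hasOwnProperty.call(vars, tok)) {
        throw new Error("unknown variable: " + tok);
      }
      return vars[tok];
    }
    throw new Error("unexpected token: " + tok);
  }

  const result = parseExpr();
  if (pos < tokens.length) throw new Error("unexpected token: " + tokens[pos]);
  return result;
}
```

In a Monte Carlo run, an evaluator like this would be called once per trial with that trial's sampled values bound to the variable names, for example evaluateExpression("headcount * salary * (1 + raise)", { headcount: 10, salary: 100, raise: 0.5 }), which evaluates to 1500.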

4. What does it enable?

Concrete uses a practitioner would recognize:

  • Price a promotion or staffing decision — model the candidate-state uncertainty and get EVPI: the most you should rationally spend to resolve it before committing (the seeded example below).
  • Range a budget or comp scenario — Monte Carlo a salary-increase or program-cost model and read the P10/P50/P90 band instead of a single guess.
  • Decide whether to run the study — feed a variable's EVPI to measurement-recommend and get a ranked, ROI-bucketed list of measurement methods (survey, experiment, benchmark, expert elicitation, …).
  • Combine disagreeing estimates — fuse an executive's number, a research estimate, and a historical figure into one posterior with honest error bars (bayesian-combine), each source's share of the precision weight reported.
  • Score forecaster calibration — run predicted intervals against realized outcomes (interval-scoring) to measure coverage and sharpness; this feeds the Leadership Index's Predictive Acuity.
  • Reuse decision models — register a decision tree once and re-run VOI against it as priors change.

Visual — (TBD — a rendered Monte Carlo distribution with P10/P50/P90 band for one model).

5. How it fits in the toolbox

Data flow:

  • Consumes — nothing mandatory from other spokes; the model is supplied by the caller (a decision tree, a distribution spec, a set of estimate sources). It deliberately takes no forced cross-spoke import.
  • Emits — typed results (MonteCarloResult, VOIResponse, IntervalScoreResult, BayesianCombineResult, MeasurementRecommendationResult). Consumers vendor src/spokes/forecasting/contracts/types.ts.
  • Feeds — the PA Products that compose Instruments: the AnyComp decision layer (simulator / scenarios), the Leadership Index (interval-scoring → Predictive Acuity), and the Analytics-Plan Generator (VOI-ranked measurement plan). A Monte Carlo summary can also be wrapped in calculus's MetricEnvelope or used directly by decision-wizard and vela.
  • Persistence boundary — write routes require a tenantId; persisted runs and analyses are tenant-scoped. The stateless PA Instruments take no tenant boundary.

Visual — Tier B (typographic data-flow). caller model → Forecasting (run / VOI / instruments) → { AnyComp simulator · Leadership Index acuity · Analytics-Plan · decision-wizard / vela }.

6. Commercialization / packaging

Forecasting is a service component, not a standalone product. Its endpoints are the PA Instruments layer — composable primitives that buyer-facing Products (AnyComp's decision layer, the Leadership Index, the Analytics-Plan Generator) assemble into something a buyer actually meets.

  • Data-license posture: the spoke ships no third-party survey data — it computes over models the caller supplies, and the measurement-catalog rows are the spoke's own structured encoding of a public methodology (How to Measure Anything, cited per row). No vendor-data licensing constraint attaches to the math itself.
  • Anything about pricing tiers or packaged offerings is (TBD) — not earned yet, so not stated.

Visual — Tier B (placement line). Instruments (this spoke) → composed by → Products (AnyComp · Leadership Index · Analytics-Plan) → buyer surface.

7. The vision

Decision math as plumbing: any people decision in the portfolio can ask "what's the range, and what is it worth to know more?" and get a reproducible, auditable answer — never a lone spreadsheet number again.

The direction is breadth of composition, not a new UI: more Products wiring the VOI loop and the calibration loop into their flows, and the measurement recommender closing the elicit→measure→re-score cycle. The named planned consumers in the registry are decision-wizard and vela. Anything beyond that is (TBD).

Visual — (TBD — the elicit → measure → re-score calibration loop as a cycle diagram).

8. Current status

Grounded in the real code state (contract 1.4.0, status: "live" in src/lib/contracts/registry.ts, src/spokes/forecasting/):

  • Shipped (live): seeded Monte Carlo over six distribution shapes with a closed-form expression parser; discrete decision-tree VOI (EVPI + optional EVSI); register/fetch decision models; and the four PA Instruments — interval scoring, Bayesian precision-combine, the ten-method measurement catalog, and the VOI-ranked measurement recommender. Live routes: POST /monte-carlo/run, POST /decision-models, GET /decision-models/[id], POST /voi/compute, POST /interval-scoring, POST /bayesian-combine, POST /measurement-recommend, GET /measurement-catalog, GET /health. Eight contract IDs in the registry; MCP module registered; demo seed shipped (drizzle/0016_pat17_forecasting_seed.sql).
  • In flight / planned: consumer wiring — decision-wizard and vela are listed planned in the registry, not yet live consumers. (The README's "1.1.0" line is stale; the contract source of truth is 1.4.0.)

Visual — Tier A (live capture). GET /api/spokes/forecasting/measurement-catalog returns the real ten-method catalog at request time; GET /api/spokes/forecasting/health reports live status.


Worked example (real, computed from the shipped seed)

The shipped seed pat17-seed-promotion-decision (in drizzle/0016_pat17_forecasting_seed.sql) encodes a real decision tree: promote vs no_promotion, over one shared candidate-state uncertainty.

POST /api/spokes/forecasting/voi/compute
{
  "tenantId": "demo",
  "decisionModelId": "pat17-seed-promotion-decision"
}

The tree's payoffs, by candidate state (each state at probability 0.5):

  • promote → state A: 100, state B: 0
  • no_promotion → state A: 30, state B: 80

The VOI core (computeVoi in src/spokes/forecasting/core/voi.ts) computes:

  • baselineEV — best action under the prior: promote = 0.5·100 + 0.5·0 = 50; no_promotion = 0.5·30 + 0.5·80 = 55 → baseline = 55.
  • perfectInformationEV — best action per resolved state: state A best = max(100, 30) = 100; state B best = max(0, 80) = 80 → 0.5·100 + 0.5·80 = 90.
  • evpi = 90 − 55 = 35.

So the response is:

→ {
    "analysis": {
      "baselineEV": 55,
      "perfectInformationEV": 90,
      "evpi": 35,
      "recommendation": "Resolving the modeled uncertainty has strictly positive expected value (EVPI > 0)."
    }
  }

What a practitioner does with it: EVPI = 35 is the ceiling on what it's worth to resolve the candidate-state uncertainty before deciding. Hand that 35 to measurement-recommend and it ranks methods by expected net VOI. At this EVPI every catalog method's expectedNetVoi (35·typicalReduction/100 − typicalCost) is far negative, so all of them bucket as not_worthwhile. The model is telling you that resolving the uncertainty has positive value in theory, but no formal study clears its own cost here, so just decide (or remodel at a realistic payoff scale). The seed file documents this same EVPI = 35; the figure above is recomputed from the tree, not invented.