Spoke

program-evaluation

Causal experiment registry — deterministic arm assignments, SHA-256 pre-registration digests, lift writes.

Character

You own evidence standards for multimillion-dollar programs—finance demands counterfactual discipline instead of postcard surveys.

Problem

External. Program owners cherry-pick anecdotes retrospectively, and spreadsheets hold the only copy of the assignments, which leaves them unverifiable.

Internal. Randomisation narratives that lack tamper-evident artefacts fail procurement reviews.

Philosophical. Serious diversity, learning, wellbeing, and policy pilots deserve registry-grade experimentation hygiene—not vibes-only narratives.

Guide

Operators define experiments that reference segmentation cohort ids opaquely, record immutable registrations with cryptographic digests, and resolve principals into weighted arms using deterministic RNG streams strengthened by an organisational pepper (PROGRAM_EVAL_RANDOMIZATION_PEPPER). Lift payloads are persisted for audit exports. Typed HTTP and MCP equivalents are exposed under PAT-11 service-key discipline, analogous to hardened product experimentation stacks, without importing sibling spoke DB layers. program-evaluation anchors narrative claims to reproducibility infrastructure.
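As an illustration of how a principal could be resolved into a weighted arm deterministically, here is a minimal TypeScript sketch. It assumes that an HMAC-SHA256 keyed by the pepper stands in for the seeded RNG stream; the Arm shape and the assignArm function are hypothetical names chosen for this example, not the spoke's actual contract.

```typescript
import { createHmac } from "node:crypto";

// Hypothetical shape for illustration; the real types live in the vendored contract.
interface Arm {
  name: string;
  weight: number; // relative weight, must be finite and >= 0
}

/**
 * Map (experimentId, principalId) to an arm. The result depends only on the
 * inputs and the secret pepper, so it can be reproduced later by anyone who
 * holds the pepper, but not predicted by anyone who does not.
 */
export function assignArm(
  experimentId: string,
  principalId: string,
  arms: Arm[],
  pepper: string,
): string {
  if (pepper.length === 0) throw new Error("pepper must not be empty");
  if (arms.length === 0) throw new Error("at least one arm is required");
  if (arms.some((a) => !Number.isFinite(a.weight) || a.weight < 0)) {
    throw new Error("arm weights must be finite and non-negative");
  }
  const total = arms.reduce((sum, a) => sum + a.weight, 0);
  if (total <= 0) throw new Error("arm weights must sum to a positive number");

  // Keyed hash of the pair; the NUL separator keeps ("ab", "c") distinct from ("a", "bc").
  const digest = createHmac("sha256", pepper)
    .update(`${experimentId}\u0000${principalId}`)
    .digest();

  // Interpret the first 6 bytes as an integer in [0, 2^48) and scale to [0, 1).
  const u = digest.readUIntBE(0, 6) / 2 ** 48;

  // Walk the cumulative weight distribution until it passes u.
  let chosen = "";
  let cumulative = 0;
  for (const arm of arms) {
    chosen = arm.name;
    cumulative += arm.weight / total;
    if (u < cumulative) break;
  }
  return chosen; // falls back to the last arm if rounding leaves cumulative just under 1
}
```

In a deployment the pepper would presumably be read from the PROGRAM_EVAL_RANDOMIZATION_PEPPER environment variable. Because the output is a pure function of the inputs and the pepper, repeating the call during a dispute reproduces the original arm.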

Abstract

Background. People interventions increasingly face CFO and legal scrutiny analogous to that applied to regulated trials, despite operating without clinical infrastructure. This calls for transparent assignments and lift artefacts.

Methodology. PAT-D3-B models experiments with hypothesis metadata; deterministic assignment surfaces backed by reproducible seeded RNG and augmented with HMAC-style peppers; persisted lift computations that reference cohort handles resolved elsewhere; and symmetrical MCP and HTTP contracts that guard writes identically via service keys. This mirrors the disciplined production experimentation references cited in the README.
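The source does not specify which estimator backs the persisted lift computations. As a rough sketch only, the following TypeScript assumes the simplest case, a difference in means between treatment and control; the LiftEstimate shape and the estimateLift function are hypothetical names for this example.

```typescript
// Hypothetical result shape; the spoke's own lift payload type may differ.
export interface LiftEstimate {
  controlMean: number;
  treatmentMean: number;
  absoluteLift: number; // treatmentMean - controlMean
  relativeLift: number | null; // absoluteLift / controlMean, null when controlMean is 0
}

// Arithmetic mean; an empty sample is rejected rather than silently yielding NaN.
function mean(values: number[]): number {
  if (values.length === 0) throw new Error("cannot average an empty sample");
  return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Difference-in-means lift between a treatment arm and the control arm.
export function estimateLift(control: number[], treatment: number[]): LiftEstimate {
  const controlMean = mean(control);
  const treatmentMean = mean(treatment);
  const absoluteLift = treatmentMean - controlMean;
  const relativeLift = controlMean === 0 ? null : absoluteLift / controlMean;
  return { controlMean, treatmentMean, absoluteLift, relativeLift };
}
```

For example, estimateLift([1, 2, 3], [2, 3, 4]) gives an absolute lift of 1 and a relative lift of 0.5. A production estimator would also report uncertainty, which this sketch omits.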

Scope. Orchestrates causal plumbing—not end-to-end survey deployment; ingestion partners remain segmentation-studio/toolbox.etl etc.

Contribution. Enables automated agents to verify digests programmatically, unlike the slide-only registries historically tolerated in HR.

Evidence / Provenance. PAT-D3-B session report archived under docs/session-reports/ plus scaffold script scripts/seed-program-evaluation-demo.ts.

Plan

  1. Author experiments

    Persist hypothesis metadata and audience filters that reference segmentation handles without violating the cross-schema import rule; POST under the service-key policy.

  2. Pre-register bundles

    Freeze the specification and hash it into a digest for tamper evidence before enrolling subjects.

  3. Resolve assignments deterministically

    Call the assignment endpoints and log seeds so that assignments can be reproduced downstream for dispute resolution or regulator inspection.

  4. Export lift responsibly

    Pass lift outputs through calculus enrichment and anonymizer gating before publishing sensitive segment breakdowns.
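The pre-registration step in the plan above can be sketched as follows. This is a minimal illustration, assuming the frozen specification is a JSON value and that the digest is a SHA-256 over a key-sorted serialization; canonicalize, registrationDigest and verifyRegistration are hypothetical helper names, and the spoke's real canonical form may differ.

```typescript
import { createHash } from "node:crypto";

type Json = null | boolean | number | string | Json[] | { [key: string]: Json };

// Serialize with object keys sorted, so specs that differ only in key order hash identically.
function canonicalize(value: Json): string {
  if (value === null || typeof value !== "object") return JSON.stringify(value);
  if (Array.isArray(value)) return `[${value.map(canonicalize).join(",")}]`;
  const entries = Object.entries(value).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const body = entries.map(([k, v]) => `${JSON.stringify(k)}:${canonicalize(v)}`);
  return `{${body.join(",")}}`;
}

// Hex-encoded SHA-256 digest of the canonical form of the specification.
export function registrationDigest(spec: Json): string {
  return createHash("sha256").update(canonicalize(spec), "utf8").digest("hex");
}

// Recompute the digest and compare it with the one recorded at registration time.
export function verifyRegistration(spec: Json, recordedDigest: string): boolean {
  return registrationDigest(spec) === recordedDigest.toLowerCase();
}
```

An auditor or automated agent holding the specification and the recorded digest can call verifyRegistration to confirm that the specification was not edited after enrolment began. Sorting keys before hashing is a design choice that avoids false mismatches when two serializers emit the same object in a different key order.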

Call to Action

Direct. Run the demo seed script (scripts/seed-program-evaluation-demo.ts), then exercise the curl examples locally.

Transitional. Review the PAT-D3 scout and session notes, which outline threat modelling for peppers versus replay risks.

Spoke I/O (visual language v1)

Every toolbox spoke shares the same abstract choreography: typed inputs on the left, distilled verbs in the center, typed outputs on the right, and (when relevant) cross-spoke HTTP composition along the bottom rail. Source package: @people-analytics-toolbox/spoke-illustrations.

[Diagram: Program evaluation]
Inputs: Experiment definition (ArmSpec + KPIs); Subject ledger (EnrollmentRows)
Main actions: Assign deterministic arms; Estimate lifts + cohort matches
Outputs: Treatment lift readouts (LiftEstimatePack); Matched cohorts (BalanceDiagnostics)
Composes with: preference-modeler, segmentation-studio

Try it now

Copy this curl. Paste in any terminal. Public read — no auth needed.

program-evaluation.health

GET

POST experiment routes remain service-key gated; heartbeat confirms schema provisioning before scripted demos.

curl -sS "https://people-analytics-toolbox.vercel.app/api/spokes/program-evaluation/health"
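For consumers who would rather run the same heartbeat check from code before a scripted demo, here is a small TypeScript sketch. The response body shape is not documented here, so the sketch only inspects the HTTP status; healthUrl, isSpokeHealthy and the FetchLike type are hypothetical names, and the fetch implementation is passed in so the function can be exercised without network access.

```typescript
// Minimal structural type so the sketch does not depend on DOM or Node typings.
type FetchLike = (url: string) => Promise<{ ok: boolean; status: number }>;

const DEFAULT_BASE_URL = "https://people-analytics-toolbox.vercel.app";

// Build the health endpoint URL, tolerating a trailing slash on the base URL.
export function healthUrl(baseUrl: string = DEFAULT_BASE_URL): string {
  return `${baseUrl.replace(/\/+$/, "")}/api/spokes/program-evaluation/health`;
}

// Resolve to true when the endpoint answers with a 2xx status, false otherwise.
export async function isSpokeHealthy(
  fetchImpl: FetchLike,
  baseUrl: string = DEFAULT_BASE_URL,
): Promise<boolean> {
  try {
    const response = await fetchImpl(healthUrl(baseUrl));
    return response.ok;
  } catch {
    // Network failures are reported as "not healthy" rather than thrown.
    return false;
  }
}
```

On a runtime with a global fetch (for example Node 18 or later), the call would look like isSpokeHealthy(fetch).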

Vendor the contract

The Zod contract is the source of truth. Vendor a copy into your consumer app — you keep it; we don't break it underneath you. Re-vendor when the version bumps.

// Vendor canonical types:
// src/spokes/program-evaluation/contracts/types.ts

Source path: src/spokes/program-evaluation/contracts/types.ts

Failure

Procurement torches discretionary budgets—you cannot attest how participants entered treatment arms versus control beyond anecdote.

Success

Registry-grade dossiers unify finance, ERG leads, auditors around reproducible arm assignments plus lift artefacts tied to cryptographic digests.