Principia Connector — plain-language explainer

Principia Connector is the toolbox's planned read-only window onto Principia's canonical-measurement catalog — today it is a reserved namespace with a heartbeat, not a live capability.

A People Analytics Toolbox component. Built to the portfolio Explainer Standard v1.0. Every claim below is grounded in the spoke's own code and contracts (src/spokes/principia-connector/, contract 0.0.0, registry status: "coming-soon"); the spoke is deliberately under-built, so most capabilities below are marked (TBD / not yet built) rather than invented.

1. What is it?

Principia Connector is the spoke that will let the rest of the toolbox look up what a metric actually means — its canonical definition, the instrument that measures it, the citations that back it, and the effects it has been shown to drive — by reading from Principia, an external canonical-measurement catalog maintained outside this repo.

It follows the same shape as the toolbox's other catalog connectors (BLS, Census, O*NET, NAICS): a read-mostly cache plus an MCP surface over an upstream authority the toolbox does not own. Principia is described in the spoke's own positioning notes as "the seventh catalog the toolbox consumes."

Today, though, the honest description is narrower: the spoke exists as a reserved namespace. Its contracts/types.ts holds nothing but CONTRACT_VERSION = "0.0.0"; its database schema declares a single heartbeat table; its MCP module registers zero tools; and the only live HTTP route is /health. The connector's real behavior lands in the work item the code calls TB-PRINCIPIA-04, which is gated on the upstream Principia API going live (PRN-014).

Visual — Tier A (live capture). The one thing this spoke really does today is answer its health probe:

GET /api/spokes/principia-connector/health
→ {
    "spoke": "principia-connector",
    "status": "ok",
    "contractVersion": "0.0.0",
    "schemaReachable": true,
    "latencyMs": <measured>,
    "checkedAt": "<timestamp>"
  }

(Shape is real — produced by checkSpokeHealth() against the principia_connector.heartbeat table per src/app/api/spokes/principia-connector/health/route.ts. contractVersion: "0.0.0" is the literal value in the contract file. latencyMs and checkedAt are measured at request time, so they are shown as placeholders rather than invented values.)
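
For a consumer that parses this response, the envelope can be written down as a type. The sketch below is one reading of the shape above, not the spoke's actual contract (contracts/types.ts holds only CONTRACT_VERSION today). The SpokeHealth name comes from this explainer; the non-"ok" status values and the isSpokeHealth guard are hypothetical.

```typescript
// Field names mirror the response shown above. Only "ok" appears in the
// source; "degraded" and "down" are ASSUMED values added for illustration.
type SpokeStatus = "ok" | "degraded" | "down";

interface SpokeHealth {
  spoke: string;
  status: SpokeStatus;
  contractVersion: string;
  schemaReachable: boolean;
  latencyMs: number; // measured at request time
  checkedAt: string; // timestamp string; exact format is not stated in the source
}

// Hypothetical runtime guard: narrows an unknown JSON body to SpokeHealth.
function isSpokeHealth(body: unknown): body is SpokeHealth {
  if (typeof body !== "object" || body === null) return false;
  const b = body as Record<string, unknown>;
  return (
    typeof b["spoke"] === "string" &&
    (b["status"] === "ok" || b["status"] === "degraded" || b["status"] === "down") &&
    typeof b["contractVersion"] === "string" &&
    typeof b["schemaReachable"] === "boolean" &&
    typeof b["latencyMs"] === "number" &&
    typeof b["checkedAt"] === "string"
  );
}
```

A caller would run isSpokeHealth on the parsed JSON before trusting any field, so a malformed or partial body is rejected rather than read as healthy.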

2. What problem does it solve — and why is it different?

The pain it is meant to remove: psychometric and outcome metrics scatter across spreadsheets, and absent a disciplined catalog, an AI agent cannot reliably resolve what "Burnout Composite v3" means — or whether it quietly displaced an older canonical variable.

The difference, stated as a shift it intends to make:

FROM a metric name treated as self-evident, with its definition, instrument, and evidence living wherever someone last typed them.
TO a metric name that resolves to a governed canonical variable with a declared definition, a measuring instrument, citation-backed effects, and an explicit trail when it gets renamed or split.

How it is meant to differ from the obvious substitutes:

vs. hand-maintaining a metric dictionary — the connector points at an external authority (Principia) that owns the canonical truth, so the toolbox doesn't fork-and-drift its own copy of "what engagement means."
vs. generic BI metric layers — those define metrics by SQL formula, not by scientific provenance. The connector's unit is a cited construct, with redirected_from lineage when a variable is renamed.

Honest caveat: none of this is callable yet. The differentiation above describes the spoke's stated purpose, not shipped behavior.

Visual — Tier B (FROM→TO typographic block). The shift above is the visual; a rendered comparison block is deferred until the cache and contract types actually land (it would otherwise illustrate behavior that does not exist).

3. How does it work?

The mechanism described here is the planned design recorded in the spoke's README and SURFACE notes; the only part that runs today is the health probe.

Inputs → method → outputs, as designed:

Input (planned): a lookup against Principia's canonical-measurement graph — a construct/variable id, an instrument id, a survey item, a citation, or an effect query.
Method (planned): a read-mostly cache with declared freshness. The principia_connector schema would gain cache tables for constructs, instruments, items, citations, and effects, at a default 24-hour TTL, with a cache-bust webhook honored for usage_restrictions and validation_status changes (the README flags this as load-bearing for correctness). Writes never flow through this connector — they stay inside Principia's curator UI to preserve provenance.
Output (planned): vendored Principia contract types (CanonicalVariable / Measure / Citation / Instrument / Item / Effect / Evidence, sourced from @measurement/core) returned over MCP tools such as principia.constructs.{list,lookup,search,resolve}, principia.instruments.*, principia.items.*, principia.citations.*, principia.effects.search, and principia.evidence.for.
Output (today): a single SpokeHealth envelope. Nothing else.
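
The planned method (a read-mostly cache with a 24-hour TTL and a cache-bust path) can be sketched in memory. This is an illustration under stated assumptions, not the connector's design: the real cache is described as database tables, and ReadMostlyCache, its method names, and the injectable clock are all hypothetical.

```typescript
// The 24-hour default named in the planned design.
const DEFAULT_TTL_MS = 24 * 60 * 60 * 1000;

interface CacheEntry<T> {
  value: T;
  fetchedAt: number; // epoch milliseconds
}

// HYPOTHETICAL in-memory stand-in for the planned cache tables.
class ReadMostlyCache<T> {
  private readonly entries = new Map<string, CacheEntry<T>>();
  private readonly ttlMs: number;
  private readonly now: () => number;

  constructor(ttlMs: number = DEFAULT_TTL_MS, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  // Returns the cached value, or undefined when it is missing or past the TTL.
  get(id: string): T | undefined {
    const entry = this.entries.get(id);
    if (entry === undefined) return undefined;
    if (this.now() - entry.fetchedAt >= this.ttlMs) {
      this.entries.delete(id);
      return undefined;
    }
    return entry.value;
  }

  // Filled only from upstream reads; nothing here writes back to Principia.
  put(id: string, value: T): void {
    this.entries.set(id, { value, fetchedAt: this.now() });
  }

  // Cache-bust hook: drop an entry as soon as upstream reports a change
  // (for example to usage_restrictions), without waiting for the TTL.
  bust(id: string): boolean {
    return this.entries.delete(id);
  }
}
```

The bust method is what makes the freshness rule safe: a TTL alone could keep serving a record for up to 24 hours after its usage restrictions changed.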

Differentiation beat: the design's distinctive discipline is resolver-redirect tolerance — when Principia bumps a canonical variable (e.g., splits engagement into vigor + dedication + absorption), the connector is specified to treat redirects as warnings, not errors, discovering renames via alternative_names[] and redirected_from. That mirrors the calculus metricKey soft-validation pattern already in the toolbox.
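
Redirect tolerance can be illustrated with a small resolver that returns warnings instead of throwing. This is a sketch under assumptions, not Principia's resolver: only alternative_names and redirected_from are named in the source, while CanonicalVariableStub, Resolution, and resolveMetricKey are hypothetical, and treating an unknown key as a warning is this sketch's own choice.

```typescript
// HYPOTHETICAL record shape; the real type would be vendored from upstream.
interface CanonicalVariableStub {
  id: string;
  alternative_names: string[];
  redirected_from: string[];
}

interface Resolution {
  matches: CanonicalVariableStub[];
  warnings: string[];
}

function resolveMetricKey(key: string, catalog: CanonicalVariableStub[]): Resolution {
  // An exact id match is the clean case: no warning needed.
  const exact = catalog.filter((v) => v.id === key);
  if (exact.length > 0) return { matches: exact, warnings: [] };

  // A renamed or split variable can point at one or several current ids.
  const redirected = catalog.filter(
    (v) => v.redirected_from.includes(key) || v.alternative_names.includes(key),
  );
  if (redirected.length > 0) {
    const targets = redirected.map((v) => v.id).join(", ");
    return { matches: redirected, warnings: [`"${key}" now resolves to: ${targets}`] };
  }

  // ASSUMPTION: an unknown key is also soft-failed rather than thrown.
  return { matches: [], warnings: [`"${key}" matched no canonical variable`] };
}
```

With a catalog where vigor, dedication, and absorption each list engagement in redirected_from, resolving "engagement" yields all three records plus one warning, so a downstream pipeline can continue and surface the rename.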

Science backing: the data the connector serves is Principia's, not the toolbox's — citation-backed constructs and effects curated upstream. The toolbox's contribution is the connector discipline (cache freshness, audit logging, redirect tolerance), not the measurement science.

Visual — Tier B (planned step flow). Principia canonical-measurement catalog → (24h-TTL cache + cache-bust webhook) → principia.* MCP tools → calculus / metrics-catalog consumers. Marked planned; only the rightmost consumers and the toolbox-side MCP gateway exist today.

4. What does it enable?

All but the first of these are (TBD / not yet built) — they activate when TB-PRINCIPIA-04 ships against a live Principia API.

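Health-gate a dependency in CI (works today) — poll GET /api/spokes/principia-connector/health until it reports status "ok" with schemaReachable true. The probe checks only the heartbeat table, so it confirms the namespace is wired, not that the upstream Principia integration is live.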
Resolve a survey item to its citations (TBD) — before publishing a longitudinal brief, look up the item's Principia citations so the brief is cite-backed.
Align metricKey with Principia IDs (TBD) — let calculus owners reconcile the toolbox's metric keys against canonical variable ids.
Ground an AI agent in cited effects (TBD) — let an MCP agent query principia.evidence.for so its claims are anchored to evidence rather than improvised.
Tolerate canonical renames (TBD) — warn agents when Principia emits redirected_from metadata instead of failing hard, so a renamed construct doesn't break a downstream pipeline.
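
The health gate in the first item is the only one that can be exercised today. A minimal polling sketch follows; waitForHealthy, its retry defaults, and the injected probe and sleep functions are hypothetical, chosen so the loop does not depend on any particular HTTP client or CI runner.

```typescript
// Only the two fields the gate inspects; both appear in the health response.
interface HealthBody {
  status: string;
  schemaReachable: boolean;
}

// HYPOTHETICAL CI helper. `probe` is expected to perform
// GET /api/spokes/principia-connector/health and return the parsed body;
// `sleep` waits between attempts. Retry counts and delays are assumptions.
async function waitForHealthy(
  probe: () => Promise<HealthBody>,
  sleep: (ms: number) => Promise<void>,
  maxAttempts: number = 10,
  delayMs: number = 2000,
): Promise<boolean> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      const body = await probe();
      if (body.status === "ok" && body.schemaReachable) return true;
    } catch {
      // A failed request is treated like an unhealthy response and retried.
    }
    if (attempt < maxAttempts) await sleep(delayMs);
  }
  return false;
}
```

A CI step would fail the job when waitForHealthy resolves to false. Because the probe reads only the heartbeat table, a passing gate says the schema is reachable and nothing more.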

Visual — Tier D (TBD — a rendered "construct lookup → cited effects" capture, once the cache serves real rows).

5. How it fits in the toolbox

Data flow, as registered and designed:

Consumes (planned) — Principia's external canonical-measurement catalog, via the upstream Principia MCP/API once PRN-014 makes it callable. Inbound auth would use TOOLBOX_MCP_KEY_PRINCIPIA; outbound auth would follow Principia's own MCP auth, whose shape is itself (TBD).
Emits (planned) — vendored Principia contract types over principia.* MCP tools. The contract is 0.0.0 and intentionally empty until those types are vendored from @measurement/core; re-vendoring is required on every upstream CONTRACT_VERSION bump.
Feeds (planned) — calculus is the one planned consumer both sources agree on: the registry lists consumers: [calculus (planned), performix (planned)], while the SURFACE notes name metrics-catalog + calculus. Planned consumers would compose Principia lookups via HTTP + MCP only, never by cross-importing measurement internals.
Reports (planned) — once the cache is hot, register data-source.principia.canonical-store with integrationStatus: "connected" into the PAT-45 Data Source Catalog.
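
The re-vendoring rule in the Emits item (re-vendor on every upstream CONTRACT_VERSION bump) could be enforced by a simple drift check. The sketch below assumes versions follow a MAJOR.MINOR.PATCH form, as "0.0.0" suggests; parseVersion and checkRevendor are hypothetical names, and how the upstream version would be fetched is left open.

```typescript
// ASSUMPTION: CONTRACT_VERSION strings look like MAJOR.MINOR.PATCH.
function parseVersion(version: string): [number, number, number] {
  const m = /^(\d+)\.(\d+)\.(\d+)$/.exec(version);
  if (m === null) {
    throw new Error(`Unrecognized contract version: ${version}`);
  }
  return [Number(m[1]), Number(m[2]), Number(m[3])];
}

interface RevendorCheck {
  required: boolean;
  reason: string;
}

// HYPOTHETICAL drift check: any difference between the vendored and the
// upstream version means the vendored types must be refreshed.
function checkRevendor(vendored: string, upstream: string): RevendorCheck {
  const v = parseVersion(vendored);
  const u = parseVersion(upstream);
  const same = v.every((part, i) => part === u[i]);
  return same
    ? { required: false, reason: `vendored types match upstream ${upstream}` }
    : { required: true, reason: `vendored ${vendored} differs from upstream ${upstream}` };
}
```

Parsing before comparing means a malformed version string fails loudly instead of being silently treated as a mismatch.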

Scope boundary (real, from SURFACE.md): this is not a scientific-publishing stack and cannot author Principia truths; causal field experiments live in program-evaluation; interim consumers should lean on metrics-catalog + calculus rather than block on this connector.

Visual — Tier B (typographic data-flow). Principia (external authority) → principia-connector (read cache + MCP façade) → { calculus · metrics-catalog }. Solid where it exists (the consumers, the gateway), dashed where it is planned (everything through the connector itself).

6. Commercialization / packaging

Principia Connector is infrastructure, not a sold product — it is a catalog connector that grounds other components, in exactly the same posture as the BLS / Census / O*NET / NAICS connectors. It sits behind buyer-facing capabilities (cite-backed metrics inside calculus and the metrics catalog) rather than being packaged on its own.

Data-license posture: the canonical-measurement content belongs to Principia, an external authority. The connector is explicitly read-mostly — writes stay in Principia's curator UI to preserve provenance — and the cache must honor usage_restrictions changes via cache-bust. Any redistribution terms are Principia's to set; the connector's job is to respect them, not to relicense them.
Pricing / tier placement: (TBD). None is stated, and none can be justified while the spoke is pre-capability.

Visual — Tier D (TBD — connector-family placement diagram showing Principia alongside the other catalog connectors).

7. The vision

Cite-backed canonical metrics available anywhere in the toolbox — every metric name resolving to a governed construct, its instrument, and the evidence behind it, with renames handled as warnings rather than breakage.

The connector pattern is the load-bearing reuse story: Principia becomes the seventh external catalog the toolbox consumes through a uniform read-cache-plus-MCP façade, so AI agents and analysts grounding work in the toolbox can reach measurement science without the toolbox owning or forking it. The near-term destination is TB-PRINCIPIA-04 — vendor the contract types, stand up the cache with declared freshness, turn on the principia.* MCP surface, log every call, and flip the registry status from coming-soon to live.

8. Current status

Grounded in the real code state (contract 0.0.0, registry status: "coming-soon", src/spokes/principia-connector/):

Shipped: the reserved namespace and wiring — principia_connector schema with a heartbeat table (migration 0055), the GET /api/spokes/principia-connector/health route, registry enrolment as coming-soon, MCP scaffolding files (handlers.ts, register.ts, tools.ts, tool-descriptions.ts, index.ts) that register zero tools, a single smoke test, and a Spoke I/O illustration. The discovery list returns [] on purpose, so an AI consumer sees the spoke advertise itself but nothing to call.
Not yet built (TBD), gated on upstream PRN-014 then TB-PRINCIPIA-04: vendored contract types (contract still 0.0.0), the cache tables and TTL/cache-bust logic, the full principia.* MCP tool surface, the outbound auth to Principia, the PAT-45 catalog integration-status flip, and the coming-soon → live promotion.

Visual — Tier A (live capture). GET /api/spokes/principia-connector/health returns the real SpokeHealth envelope shown in §1; that probe is the entirety of the spoke's live behavior today.

Worked example (real, end to end): the only thing this spoke does today is answer its health probe. A request to GET /api/spokes/principia-connector/health runs checkSpokeHealth({ slug: "principia-connector", schemaName: "principia_connector", contractVersion: "0.0.0" }), which issues SELECT count(*) FROM "principia_connector"."heartbeat"; the migration-seeded row keeps the count non-zero, so the probe returns { status: "ok", contractVersion: "0.0.0", schemaReachable: true } with a measured latencyMs and checkedAt. That is the honest extent of the live capability — every measurement-lookup behavior described above is planned, not running. No construct, citation, or effect value is invented here because none can yet be served.
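
The probe logic in the worked example can be approximated as follows. This is a sketch, not the toolbox's checkSpokeHealth(): the input fields and the SQL text come from the description above, while checkSpokeHealthSketch, the injected runCount and now functions, the "down" status, and the ISO timestamp format are assumptions.

```typescript
interface HealthCheckInput {
  slug: string;
  schemaName: string;
  contractVersion: string;
}

interface HealthEnvelope {
  spoke: string;
  status: "ok" | "down"; // "down" is an ASSUMED failure value
  contractVersion: string;
  schemaReachable: boolean;
  latencyMs: number;
  checkedAt: string;
}

// HYPOTHETICAL re-creation of the probe. `runCount` stands in for the
// database client and resolves to the heartbeat row count.
async function checkSpokeHealthSketch(
  input: HealthCheckInput,
  runCount: (sql: string) => Promise<number>,
  now: () => number = () => Date.now(),
): Promise<HealthEnvelope> {
  const startedAt = now();
  let schemaReachable = false;
  let rowCount = 0;
  try {
    rowCount = await runCount(`SELECT count(*) FROM "${input.schemaName}"."heartbeat"`);
    schemaReachable = true;
  } catch {
    // An unreachable schema is reported in the envelope, not thrown.
    schemaReachable = false;
  }
  const finishedAt = now();
  return {
    spoke: input.slug,
    // The migration-seeded row keeps the count non-zero in the healthy case.
    status: schemaReachable && rowCount > 0 ? "ok" : "down",
    contractVersion: input.contractVersion,
    schemaReachable,
    latencyMs: finishedAt - startedAt,
    checkedAt: new Date(finishedAt).toISOString(),
  };
}
```

Injecting runCount and now keeps the sketch testable without a database. The schema name is interpolated into the SQL only because it is a fixed, code-supplied constant, never user input.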