Spoke
segmentation-studio
stitching, done right — canonical fields, identity resolution, versioned packs.
Character
Problem
External. Every HRIS uses different column names. Workday calls it Employee Status; ADP calls it Active/Inactive; SuccessFactors uses Employment Status. Multi-source joins are heroic Excel exercises. Segment definitions drift across teams — "engineering and west" means different things to different people.
Internal. You're the one who knows the columns. That knowledge isn't in any system; it's in your head, and it walks out the door if you leave.
Philosophical. Canonical-field normalization is a platform problem, not an analyst problem. It should be solved once, versioned, and consumed by every downstream tool.
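The header mismatch described above can be illustrated with a small alias table. This is a minimal sketch, not the shipped catalog: the canonical name `employment_status`, the `HEADER_ALIASES` map, and the helper names are all hypothetical, and only the three vendor headers come from the text.

```typescript
// Hypothetical canonical field name; the real catalog's identifiers may differ.
type CanonicalField = "employment_status";

// Vendor headers from the text, keyed in normalized (lower-case) form.
const HEADER_ALIASES = new Map<string, CanonicalField>([
  ["employee status", "employment_status"],   // Workday
  ["active/inactive", "employment_status"],   // ADP
  ["employment status", "employment_status"], // SuccessFactors
]);

// Trim, lower-case, and collapse whitespace/underscores so that
// "Employee_Status " and "employee status" compare equal.
function normalizeHeader(header: string): string {
  return header.trim().toLowerCase().replace(/[\s_]+/g, " ");
}

// Returns the canonical field for a vendor header, or null when unmapped.
function toCanonical(header: string): CanonicalField | null {
  return HEADER_ALIASES.get(normalizeHeader(header)) ?? null;
}

// Rewrites one row's keys to canonical names, keeps column lineage
// (canonical field -> original header), and lists headers left unmapped.
function canonicalizeRow(row: Record<string, string>): {
  values: Record<string, string>;
  lineage: Record<string, string>;
  unmapped: string[];
} {
  const values: Record<string, string> = {};
  const lineage: Record<string, string> = {};
  const unmapped: string[] = [];
  for (const [header, value] of Object.entries(row)) {
    const field = toCanonical(header);
    if (field === null) {
      unmapped.push(header);
      continue;
    }
    values[field] = value;
    lineage[field] = header;
  }
  return { values, lineage, unmapped };
}
```

The sketch only renames columns. Reconciling the values themselves (for example, one vendor's "Active" against another's "A") would need a separate mapping.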
Guide
Config_Segmentation parity), and versions schemas so cohort logic is immutable like code. Demographic-ratio segments plus six PAT-156 canonical enrichment fields now ship beside the flagship 35-field catalog. MCP + HTTP parity keeps agents and analysts on identical contracts.
segmentation-studio: HRIS canonical-field normalization plus multi-membership segmentation. Cross-source joins carry column lineage plus conflict disclosures; hierarchical canonical segments expose stable ids; segmentation packs remain vendorable artefacts for downstream systems.
Abstract
Background. HRIS schemas diverge by vendor release; segmentation rot is inevitable unless canonical keys, membership algebra, and auditable versioning live in substrate services analysts do not rewrite every quarter.
Methodology. A scored priority catalogue maps headers to enumerated canonical fields. Multi-membership segmentation evaluates composable predicates over materialised nodes. Supplemental connectors persist tenant datasets; PAT-156 extends the global field registry with cohort-comparison constructs; declarative rules mirror spreadsheet engines for repeatability; SegmentationSchemaVersion rows provide branch/diff/snapshot semantics.
Scope. Resolves segments only—does not price compensation (anycomp) or prove causality (program-evaluation). Calculated psychological segments progressively gain resolver parity per PAT-63 follow-ons.
Contribution. Stateless HTTP + MCP share Zod contracts; semver-like schema tooling gives analytics engineers the governance language already normal for APIs.
Evidence / Provenance. Donor lift plus PAT-catalogued migrations (e.g., drizzle/0087_pat156_*, demographic-ratio seeds) recorded in CHANGELOG and README.
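The "composable predicates" idea in the Methodology can be sketched as a small recursive evaluator. The leaf shape (`field`, `op: "eq"`, `value`) and the `any` combinator mirror the criteria payload shown on this page; the `all` combinator, the `region` attribute, and the in-memory `Member` type are assumptions made for illustration and are not confirmed parts of the service.

```typescript
// Leaf predicate: shape taken from the page's criteria example.
type Leaf = { field: string; op: "eq"; value: string };

// `any` appears on this page; `all` is a hypothetical companion combinator.
type Criteria = Leaf | { any: Criteria[] } | { all: Criteria[] };

// Hypothetical in-memory member record.
type Member = { id: string; attrs: Record<string, string> };

// Recursively evaluates criteria against one member.
// An empty `any` matches nobody; an empty `all` matches everybody.
function matches(member: Member, criteria: Criteria): boolean {
  if ("any" in criteria) return criteria.any.some((c) => matches(member, c));
  if ("all" in criteria) return criteria.all.every((c) => matches(member, c));
  return member.attrs[criteria.field] === criteria.value;
}

// Returns the ids of every member satisfying the criteria, in input order.
function resolveCohort(members: Member[], criteria: Criteria): string[] {
  return members.filter((m) => matches(m, criteria)).map((m) => m.id);
}

const members: Member[] = [
  { id: "a", attrs: { function: "engineering", region: "west" } },
  { id: "b", attrs: { function: "engineering", region: "east" } },
  { id: "c", attrs: { function: "sales", region: "west" } },
];

// "engineering and west", written down once instead of re-interpreted per team.
const engineeringWest: Criteria = {
  all: [
    { field: "function", op: "eq", value: "engineering" },
    { field: "region", op: "eq", value: "west" },
  ],
};

const engineeringWestIds = resolveCohort(members, engineeringWest); // ["a"]
```

Because the criteria are plain data, the same definition can be stored, versioned, and diffed like any other schema artefact.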
Plan
- 01 Ingest + sync HRIS rows
Use segmentation-studio.hris-ingestor PAT-65 Workday / Bamboo sync routes with tenant context + service keys; responses include run ids for audit trails.
- 02 Resolve identities + joins
identity-resolve clusters sources; data-join/run merges supplements with OVERWRITE / IGNORE / FILL_HOLES policies and overlap reports.
- 03 Author declarative + classic segments
Manage PAT-160 declarative rules via declarative-segmentation/* APIs and continue to call segments-define / cohorts-resolve for multi-membership cohorts.
- 04 Pin immutable definitions
Branch or snapshot schemas/* versions and/or publish packs so finance, TA, and comp quote the same segmentation moment.
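Step 02 names three merge policies without defining them. The sketch below shows one plausible reading, stated as an assumption rather than the service's documented behaviour: OVERWRITE lets the supplement win wherever it has a value, IGNORE leaves any column already present in the base untouched (even when blank), and FILL_HOLES only fills blank base cells. The `mergeRow` helper, the `Overlap` shape, and the `level` column are hypothetical.

```typescript
type Policy = "OVERWRITE" | "IGNORE" | "FILL_HOLES";
type Row = Record<string, string | null>;

// One entry per field where base and supplement both hold different values.
interface Overlap {
  field: string;
  base: string;
  supplement: string;
}

// True when a cell holds a non-empty value.
const present = (v: string | null | undefined): v is string =>
  v != null && v !== "";

// Merges one supplement row into one base row under the assumed semantics
// described above, and reports every conflicting field regardless of policy.
function mergeRow(
  base: Row,
  supplement: Row,
  policy: Policy
): { merged: Row; overlaps: Overlap[] } {
  const merged: Row = { ...base };
  const overlaps: Overlap[] = [];
  for (const [field, incoming] of Object.entries(supplement)) {
    if (!present(incoming)) continue; // supplement has nothing to offer
    if (!Object.prototype.hasOwnProperty.call(base, field)) {
      merged[field] = incoming; // brand-new column: always added
      continue;
    }
    const existing = base[field];
    if (present(existing)) {
      if (existing !== incoming) {
        overlaps.push({ field, base: existing, supplement: incoming });
      }
      if (policy === "OVERWRITE") merged[field] = incoming;
    } else if (policy !== "IGNORE") {
      merged[field] = incoming; // OVERWRITE and FILL_HOLES fill blank cells
    }
  }
  return { merged, overlaps };
}

const baseRow: Row = { function: "engineering", region: null };
const supplementRow: Row = { function: "eng", region: "west", level: "L5" };

const filled = mergeRow(baseRow, supplementRow, "FILL_HOLES");
// filled.merged   -> { function: "engineering", region: "west", level: "L5" }
// filled.overlaps -> [{ field: "function", base: "engineering", supplement: "eng" }]
```

Reporting overlaps independently of the policy keeps the conflict visible even when the chosen policy silently resolves it.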
Call to Action
Direct. Try the API. Ingest a sample HRIS file free.
Transitional. Read the canonical-field methodology (Phase 3). Hit the catalog inline.
Spoke I/O (visual language v1)
Every toolbox spoke shares the same abstract choreography: typed inputs on the left, distilled verbs in the center, typed outputs on the right, and (when relevant) cross-spoke HTTP composition along the bottom rail. Source package: @people-analytics-toolbox/spoke-illustrations.
Try it now
Copy this curl. Paste in any terminal. POST endpoint — set TOOLBOX_SERVICE_KEY in your shell first.
segmentation-studio.cohorts-resolve
POST · SERVICE KEY REQUIRED
Resolve a multi-membership cohort from criteria. Returns memberIds + segmentNodeIds for downstream joins.
curl -sS -X POST "https://people-analytics-toolbox.vercel.app/api/spokes/segmentation-studio/cohorts/resolve" \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $TOOLBOX_SERVICE_KEY" \
  -d '{
    "tenantId": "demo",
    "criteria": {
      "any": [
        { "field": "function", "op": "eq", "value": "engineering" }
      ]
    }
  }'
Vendor the contract
The Zod contract is the source of truth. Vendor a copy into your consumer app — you keep it; we don't break it underneath you. Re-vendor when the version bumps.
// In your consumer app:
import { z } from "zod";
// Vendor a copy of these contracts from the toolbox repo at:
// src/spokes/segmentation-studio/contracts/types.ts
import {
  CanonicalFieldsListResponseSchema,
  HrisIngestRequestSchema,
  HrisIngestResponseSchema,
  ResolveCohortRequestSchema,
  ResolveCohortResponseSchema,
  PublishPackResponseSchema,
  CONTRACT_VERSION,
} from "./vendored/segmentation-studio/types";
// Then call the toolbox over HTTP or MCP.
// See docs/EXTERNAL-CONSUMERS.md for onboarding.
Source path: src/spokes/segmentation-studio/contracts/types.ts
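For consumers calling from TypeScript rather than curl, the request can be assembled as in the sketch below. The URL, headers, and payload shape are taken from the curl example on this page, and `memberIds` / `segmentNodeIds` come from the endpoint description. The helper names, the assumption that both id lists are arrays of strings, and the hand-written guard are illustrative; with the vendored contracts in place, `ResolveCohortResponseSchema` would be the natural validator instead.

```typescript
// Base URL copied from the curl example on this page.
const BASE_URL =
  "https://people-analytics-toolbox.vercel.app/api/spokes/segmentation-studio";

// Assumed response shape: the field names are from the endpoint description,
// the string[] element type is a guess.
interface ResolveCohortResponse {
  memberIds: string[];
  segmentNodeIds: string[];
}

// Builds the URL and fetch-style init object for POST cohorts/resolve.
// The service key is passed in so the helper stays free of environment access.
function buildResolveRequest(
  serviceKey: string,
  tenantId: string,
  criteria: unknown
) {
  return {
    url: `${BASE_URL}/cohorts/resolve`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${serviceKey}`,
      },
      body: JSON.stringify({ tenantId, criteria }),
    },
  };
}

function isStringArray(v: unknown): v is string[] {
  return Array.isArray(v) && v.every((x) => typeof x === "string");
}

// Hand-written stand-in for schema validation: throws on an unexpected shape.
function parseResolveResponse(json: unknown): ResolveCohortResponse {
  if (typeof json !== "object" || json === null) {
    throw new Error("response is not an object");
  }
  const record = json as Record<string, unknown>;
  const memberIds = record["memberIds"];
  const segmentNodeIds = record["segmentNodeIds"];
  if (!isStringArray(memberIds) || !isStringArray(segmentNodeIds)) {
    throw new Error("response is missing memberIds or segmentNodeIds");
  }
  return { memberIds, segmentNodeIds };
}

// Same payload as the curl example; "demo-key" is a placeholder.
const request = buildResolveRequest("demo-key", "demo", {
  any: [{ field: "function", op: "eq", value: "engineering" }],
});
```

In a runtime with a global `fetch`, the result can be passed along as `fetch(request.url, request.init)` and the parsed JSON body handed to `parseResolveResponse`.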
Failure
Demographic-ratio narratives ship without numerator discipline; spreadsheets return because Postgres-backed Config_Segmentation parity was never trusted. Schema drift between People Analytics and finance persists when versions live in Slack screenshots.
Success
Composable cohort logic with mathematically reproducible versioning—engineering, TA, finance, and comp share the same labelled slice instead of arguing over parallel dashboards.