Spoke

job-family-agent

~1,016 SOC (O*NET v28.3) occupations + 23 families + 26 functions + heuristic classify API.

Character

You normalise chaotic job titles across HRIS extracts before compensation, mobility, compliance, or planning analytics can safely join.

Problem

External. SOC crosswalk spreadsheets multiply; vendors charge rent for taxonomy data that should ride on open licences.

Internal. You inherit “Lead Engineer IV” chaos with zero confidence about SOC lineage.

Philosophical. Canonical labour taxonomy belongs in openly vendored contracts—not locked PDFs attached to SaaS bundles.

Guide

JSON-backed registries load at module init. A heuristic classifier scores token overlap between free text and canonical aliases, then returns ranked SOC, family, and function hypotheses, each with a calibrated confidence score. Classify POSTs sit behind per-IP rate limits, while catalog GETs remain public reads. job-family-agent was folded in from meta-factory per PAT-71, with MIT-friendly data packs and LICENSE sidecars.

Abstract

Background. Job harmonisation gates comp (anycomp), workforce planning (Conductor), and compliance — yet spreadsheets remain default.

Methodology. Deterministic loaders expose list and lookup routes; a heuristic classifier scores lexical overlap against canonical alias maps; the spoke intentionally writes nothing except audit logs.

Scope. Token-overlap v1—not deep embeddings; inference of pay bands or levels stays upstream.

Contribution. Zod contracts with MCP symmetry; predictable latency for autonomous agents citing SOC evidence.

Evidence / Provenance. O*NET CC BY JSON bundles under src/spokes/job-family-agent/data/.
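
The Methodology and Scope paragraphs above describe token-overlap scoring against alias maps. A minimal sketch of that idea follows, assuming Jaccard overlap as the score and a hypothetical OccupationEntry shape; the spoke's real scoring, calibration, and types live in its contracts and may differ. The sample registry rows are illustrative, not the shipped data pack.

```typescript
// Hypothetical registry entry; the real shape lives in contracts/types.ts.
interface OccupationEntry {
  socCode: string;
  title: string;
  aliases: string[];
}

interface Hypothesis {
  socCode: string;
  title: string;
  confidence: number; // Jaccard overlap in [0, 1]; an assumed stand-in for the spoke's score
}

// Lowercase, split on non-alphanumerics, drop empty tokens.
function tokenize(text: string): Set<string> {
  return new Set(text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0));
}

// |A ∩ B| / |A ∪ B|, defined as 0 when either side is empty.
function jaccard(a: Set<string>, b: Set<string>): number {
  if (a.size === 0 || b.size === 0) return 0;
  let shared = 0;
  a.forEach((token) => {
    if (b.has(token)) shared += 1;
  });
  return shared / (a.size + b.size - shared);
}

// Score each occupation by its best-matching title or alias, then rank.
function classifyTitle(freeText: string, registry: OccupationEntry[], topK = 3): Hypothesis[] {
  const query = tokenize(freeText);
  return registry
    .map((entry) => {
      const candidates = [entry.title, ...entry.aliases];
      const best = Math.max(...candidates.map((c) => jaccard(query, tokenize(c))));
      return { socCode: entry.socCode, title: entry.title, confidence: best };
    })
    .filter((h) => h.confidence > 0)
    .sort((x, y) => y.confidence - x.confidence)
    .slice(0, topK);
}

// Illustrative rows only; aliases are invented for the example.
const sampleRegistry: OccupationEntry[] = [
  { socCode: "15-1252", title: "Software Developers", aliases: ["software engineer", "lead engineer"] },
  { socCode: "11-3121", title: "Human Resources Managers", aliases: ["hr manager"] },
];

const ranked = classifyTitle("Lead Software Engineer IV", sampleRegistry);
```

Taking the best score across title and aliases keeps one strong alias match from being diluted by unrelated aliases; entries with no shared tokens are dropped rather than returned with a zero score.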

Plan

  1. Browse catalogs

    Enumerate occupations, families, and functions via public GET endpoints for discovery UIs or agents.

  2. Classify free text

    POST /api/spokes/job-family-agent/classify; honour the per-IP rate limit (100 requests per minute) or wrap the call behind your own gateway.

  3. Vendor contracts

    Pin CONTRACT_VERSION in consumer repos to the same value the toolbox exports.

  4. Compose downstream

    Feed SOC outputs into anycomp modelling, wage-compliance lookups, workforce-planning reconciliation.
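
For the classify step, a consumer may want to stay under the 100 rpm limit on its own side rather than rely on rejected requests. The sketch below is one way to do that with a sliding-window limiter. The class and function names are hypothetical, the host is taken from the public curl example on this page, and the `{ text }` request body is an assumed field name, not a confirmed contract.

```typescript
// Sliding-window limiter: permits at most `limit` calls in any `windowMs` span.
class SlidingWindowLimiter {
  private timestamps: number[] = [];
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns 0 and records the call if it may go now; otherwise returns ms to wait.
  reserve(now: number): number {
    // Forget calls that have aged out of the window.
    this.timestamps = this.timestamps.filter((t) => now - t < this.windowMs);
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(now);
      return 0;
    }
    return this.timestamps[0] + this.windowMs - now;
  }
}

// Path from step 2 above; host assumed to match the public catalog endpoint.
const CLASSIFY_URL =
  "https://people-analytics-toolbox.vercel.app/api/spokes/job-family-agent/classify";

async function classify(text: string, limiter: SlidingWindowLimiter): Promise<unknown> {
  let waitMs = limiter.reserve(Date.now());
  while (waitMs > 0) {
    await new Promise((resolve) => setTimeout(resolve, waitMs));
    waitMs = limiter.reserve(Date.now());
  }
  const response = await fetch(CLASSIFY_URL, {
    method: "POST",
    headers: { "content-type": "application/json" },
    body: JSON.stringify({ text }), // assumed field name
  });
  if (!response.ok) throw new Error(`classify failed: HTTP ${response.status}`);
  return response.json(); // validate against the vendored contract before use
}
```

A shared limiter such as `new SlidingWindowLimiter(100, 60_000)` could be passed to every `classify` call in a process; the server limit is per IP, so several processes behind one address would still need coordination at a gateway.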

Call to Action

Direct. curl classify with a messy title string today.

Transitional. Read PAT-71 README for JSON provenance tables.

Spoke I/O (visual language v1)

Every toolbox spoke shares the same abstract choreography: typed inputs on the left, distilled verbs in the center, typed outputs on the right, and (when relevant) cross-spoke HTTP composition along the bottom rail. Source package: @people-analytics-toolbox/spoke-illustrations.

Job family agent

INPUTS
  Tenant job descriptions (JdTextPack)
  SOC / O*NET corpus (OnetSocTable)
  Alias + taxonomy rules (TitleAliasPack)

MAIN ACTIONS
  Embed + match taxonomy
  Emit family / function / level

OUTPUTS
  Canonical job spine row (JobSpine)
  Classification (SOC + confidence)

COMPOSES WITH
  anycomp
  wage-compliance

Try it now

Copy this curl. Paste in any terminal. Public read — no auth needed.

job-family-agent.families.list

GET

Public enumeration of canonical job families (SOC-aligned).

curl -sS "https://people-analytics-toolbox.vercel.app/api/spokes/job-family-agent/families/list"
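
The same public read can be issued from TypeScript. This is a sketch only: the response shape is not documented on this page, so the result is typed as unknown, and the FetchLike type and fetchFamilies name are hypothetical helpers introduced for illustration.

```typescript
// URL copied from the curl command above.
const FAMILIES_LIST_URL =
  "https://people-analytics-toolbox.vercel.app/api/spokes/job-family-agent/families/list";

// Minimal structural type so a stub can stand in for the global fetch in tests.
type FetchLike = (url: string) => Promise<{
  ok: boolean;
  status: number;
  json(): Promise<unknown>;
}>;

// Fetch the family catalog; no auth header is sent because the route is a public read.
async function fetchFamilies(fetchImpl: FetchLike = fetch): Promise<unknown> {
  const response = await fetchImpl(FAMILIES_LIST_URL);
  if (!response.ok) throw new Error(`families/list failed: HTTP ${response.status}`);
  return response.json();
}
```

Calling `fetchFamilies()` with no argument uses the runtime's global fetch (Node 18 or later, or a browser); parse the result with the vendored contract before relying on any field.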

Vendor the contract

The Zod contract is the source of truth. Vendor a copy into your consumer app — you keep it; we don't break it underneath you. Re-vendor when the version bumps.

// Vendor canonical types:
// src/spokes/job-family-agent/contracts/types.ts

Source path: src/spokes/job-family-agent/contracts/types.ts · GitHub
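
One way to notice when a re-vendor is due is to compare the pinned version against whatever version the toolbox currently reports. The sketch below assumes CONTRACT_VERSION is a plain string and that the consumer can obtain the published value somehow (for example from a package export); both are assumptions, and the version value shown is a placeholder.

```typescript
// Stand-in for the constant copied from the vendored contracts/types.ts; placeholder value.
const VENDORED_CONTRACT_VERSION = "1.0.0";

// Throws when the vendored copy no longer matches what the toolbox exports.
function assertContractPinned(vendored: string, published: string): void {
  if (vendored !== published) {
    throw new Error(
      `job-family-agent contract drift: vendored ${vendored}, toolbox exports ${published}. ` +
        "Re-vendor src/spokes/job-family-agent/contracts/types.ts."
    );
  }
}
```

Running a check like this in CI, with the published value supplied by the consumer's own tooling, turns a silent version bump into an explicit failure.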

Failure

Every downstream spoke invents incompatible job ladders; comp benchmarking becomes opinion, not ontology.

Success

One SOC-aligned spine shared across toolbox consumers—confidence scores travel with canonical ids for reproducible narratives.