Wage Benchmark — plain-language explainer

Wage Benchmark tells you what an hour of a given job is worth in a given place — and is honest about how sure it is.

A People Analytics Toolbox component. Built to the portfolio Explainer Standard v1.0. Every claim below is grounded in the spoke's own code and contracts (src/spokes/wage-benchmark/, contract 0.3.0); anything not yet built is marked (TBD).

1. What is it?

Wage Benchmark is a market-pay lookup: give it a job (as a SOC occupation code) and a place (national, a state, or a metro area), and it returns the hourly market wage — a median, a percentile band, and a confidence interval — drawn from U.S. Bureau of Labor Statistics survey data.

What makes it more than a lookup is that every answer carries its own honesty flags: whether the number was directly observed for that exact job-and-place cell, or projected from a broader anchor, and how wide the uncertainty is as a result.

Visual — Tier A (live API capture). A real call to the running spoke:

GET /api/spokes/wage-benchmark/benchmark?soc=11-2021&level=national
→ {
    "title": "Marketing Managers",
    "medianHourly": 77.42,
    "percentilesHourly": { "p10": 39.38, "p25": 53.47, "p75": 101.48, "p90": null },
    "confidenceInterval": { "low": 74.32, "high": 80.52, "level": 0.8 },
    "basis": "oews-national-observed",
    "confidence": "observed",
    "observedForCell": true,
    "sources": [{ "name": "BLS OEWS (national)", "kind": "government", "asOf": "2024" }]
  }

(Real output of estimateHourlyBenchmark, BLS OEWS 2024 national file.)
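
For readers who want to reproduce the call, the request path can be assembled as in the sketch below. The names BenchmarkQuery and buildBenchmarkPath are hypothetical, introduced only for illustration; the route and the soc, level, and code parameters are taken from the captures in this explainer, and the authoritative types live in the spoke's contracts.

```typescript
// Hypothetical query shape, inferred from the captures in this explainer.
type GeoLevel = "national" | "state" | "metro";

interface BenchmarkQuery {
  soc: string;    // SOC occupation code, e.g. "11-2021"
  level: GeoLevel;
  code?: string;  // area code for state/metro lookups, e.g. "MS"
}

// Builds the GET path for the benchmark route.
// Parameters are emitted in the order soc, level, code.
function buildBenchmarkPath(q: BenchmarkQuery): string {
  const params: Array<[string, string]> = [
    ["soc", q.soc],
    ["level", q.level],
  ];
  if (q.code !== undefined) {
    params.push(["code", q.code]);
  }
  const query = params
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return `/api/spokes/wage-benchmark/benchmark?${query}`;
}
```

Calling buildBenchmarkPath({ soc: "11-2021", level: "national" }) yields the path shown in the capture above.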

2. What problem does it solve — and why is it different?

The pain it removes: most market-pay numbers arrive as a single confident figure with no provenance. You can't tell whether "$77/hr for this role in Tucson" is a real local observation or a national average someone quietly stretched to fit.

The difference, stated as a shift:

FROM one point number, same-looking everywhere, with the uncertainty hidden.
TO an observed number where the survey actually measured that cell, an explicitly projected number where it didn't, and a band that widens honestly when the evidence is thin.

How it differs from the obvious substitutes:

vs. doing it by hand in the OEWS spreadsheets — the raw BLS files are 100k+ rows across national, state, metro, and industry cuts with different shapes; Wage Benchmark resolves the right cell, converts annual↔hourly on the BLS standard-hours convention, and falls back deterministically when a cell is missing.
vs. a generic salary site — those rarely tell you which number is real. Wage Benchmark refuses to launder a projection as an observation; the basis and confidence fields are first-class.
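
The annual↔hourly conversion mentioned above is simple arithmetic: BLS OEWS derives annual wages by multiplying the hourly wage by 2,080 hours (a year-round, full-time schedule). A minimal sketch follows; the function names and the rounding choices are assumptions for illustration, not the spoke's actual helpers.

```typescript
// BLS OEWS convention: annual wage = hourly wage x 2,080 hours
// (40 hours per week x 52 weeks).
const BLS_STANDARD_HOURS_PER_YEAR = 2080;

// Hypothetical helper: hourly wage to annual wage, rounded to whole dollars.
function hourlyToAnnual(hourly: number): number {
  return Math.round(hourly * BLS_STANDARD_HOURS_PER_YEAR);
}

// Hypothetical helper: annual wage to hourly wage, rounded to cents.
function annualToHourly(annual: number): number {
  return Math.round((annual / BLS_STANDARD_HOURS_PER_YEAR) * 100) / 100;
}

// The national median from the capture in section 1, $77.42/hr,
// corresponds to roughly $161,034 per year under this convention.
const annual = hourlyToAnnual(77.42);
const backToHourly = annualToHourly(annual);
```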

Visual — Tier B (FROM→TO typographic block). The shift above is the visual; a rendered comparison block is a follow-up (FU-A).

3. How does it work?

Inputs → method → outputs, concretely:

Input: a SOC occupation code + a geo selector ({ level: "national" | "state" | "metro", code? }).
Method — observed-first ladder. The estimator tries the most credible source available and records which rung it landed on:
1. Direct cell observation — BLS measured this exact SOC × place. Tightest band. (oews-metro-observed, oews-state-observed, oews-national-observed.)
2. Cost-of-labor projection — no per-SOC local observation, so it takes the national anchor for that SOC and scales it by the place's observed all-occupation cost-of-labor index. Moderately wide band. (oews-geo-index-projected.)
3. National anchor, widened — only when no area index exists at all. (oews-national-projected.)
Output: median hourly + p10/p25/p75/p90 + an 80% confidence interval, plus the honesty fields basis, confidence (observed | modeled), observedForCell (boolean), and cited sources.
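
The ladder above can be sketched as a deterministic fallback. This is an illustration under stated assumptions, not the spoke's implementation: SeedData, resolveMedian, and the lookup-key formats are hypothetical, and band widths are omitted because this explainer does not specify them. Only the basis strings and the honesty fields come from the text above.

```typescript
type Basis =
  | "oews-metro-observed"
  | "oews-state-observed"
  | "oews-national-observed"
  | "oews-geo-index-projected"
  | "oews-national-projected";

type Geo =
  | { level: "national" }
  | { level: "state" | "metro"; code: string };

interface LadderResult {
  medianHourly: number;
  basis: Basis;
  confidence: "observed" | "modeled";
  observedForCell: boolean;
}

// Hypothetical in-memory stand-in for the OEWS seed data.
interface SeedData {
  observed: Partial<Record<string, number>>;       // "soc|level|code" -> median hourly
  nationalAnchor: Partial<Record<string, number>>; // soc -> national median hourly
  areaIndex: Partial<Record<string, number>>;      // "level|code" -> cost-of-labor index
}

const round2 = (x: number): number => Math.round(x * 100) / 100;

function resolveMedian(soc: string, geo: Geo, data: SeedData): LadderResult | undefined {
  const anchor = data.nationalAnchor[soc];
  if (geo.level === "national") {
    if (anchor === undefined) return undefined;
    return { medianHourly: anchor, basis: "oews-national-observed", confidence: "observed", observedForCell: true };
  }
  // Rung 1: direct observation for this exact SOC x place cell.
  const direct = data.observed[`${soc}|${geo.level}|${geo.code}`];
  if (direct !== undefined) {
    const basis: Basis = geo.level === "metro" ? "oews-metro-observed" : "oews-state-observed";
    return { medianHourly: direct, basis, confidence: "observed", observedForCell: true };
  }
  if (anchor === undefined) return undefined; // no anchor to project from
  // Rung 2: scale the national anchor by the area's cost-of-labor index.
  const index = data.areaIndex[`${geo.level}|${geo.code}`];
  if (index !== undefined) {
    return { medianHourly: round2(anchor * index), basis: "oews-geo-index-projected", confidence: "modeled", observedForCell: false };
  }
  // Rung 3: national anchor only; no area index exists.
  return { medianHourly: anchor, basis: "oews-national-projected", confidence: "modeled", observedForCell: false };
}
```

The point of the design is that each return path sets basis, confidence, and observedForCell together, so a projected figure cannot leave the function looking like an observation.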

Differentiation beat: the practitioner's real question isn't "what's the number" — it's "can I defend this number in a pay conversation?" The basis ladder answers that directly: an observed cell is survey-grade; a modeled cell is labelled as an estimate with its method spelled out in the notes.

Visual — Tier A (live capture of the projection path). When the exact cell isn't observed:

GET /api/spokes/wage-benchmark/benchmark?soc=11-1000&level=state&code=MS
→ medianHourly: 39.84, basis: "oews-geo-index-projected", confidence: "modeled",
  notes: ["No direct per-SOC observation for this state cell; projecting the
           national anchor by Mississippi's observed all-occupation cost-of-labor
           index (0.79×)."]

(Real output — the spoke tells you, in the response, that this is a projection and why.)
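
On the consuming side, the honesty fields make it straightforward to keep a projection from being displayed as an observation. The sketch below is one way a consumer might do that; ProvenanceFields and describeForDisplay are hypothetical names, and the interface lists only the fields that appear in the captures above.

```typescript
// Subset of the response fields shown in the captures; hypothetical type name.
interface ProvenanceFields {
  medianHourly: number;
  basis: string;
  confidence: "observed" | "modeled";
  observedForCell: boolean;
  notes?: string[];
}

// Formats a wage for display and always states whether it is observed or estimated.
function describeForDisplay(e: ProvenanceFields): string {
  const amount = `$${e.medianHourly.toFixed(2)}/hr`;
  if (e.confidence === "observed" && e.observedForCell) {
    return `${amount} (observed, ${e.basis})`;
  }
  // For modeled figures, surface the first note so the method travels with the number.
  const why = e.notes && e.notes.length > 0 ? ` ${e.notes[0]}` : "";
  return `${amount} (estimate, ${e.basis}).${why}`;
}
```

Fed the Mississippi response above, this produces a string beginning "$39.84/hr (estimate, oews-geo-index-projected)." followed by the spoke's own note.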

4. What does it enable?

Concrete uses a practitioner would recognize:

Sanity-check an offer against the local market floor for the role, with a defensible band rather than a single number.
Geo-differentiate pay — see how the same job prices across metros, and whether the difference is observed or merely cost-of-labor-projected.
Feed a pay model — supply the market-anchor leg to a fuller compensation model (it is the sibling of wage-compliance: market pricing vs. the legal floor).
Range-build with nested context — a metro answer arrives already wrapped in its containing state and national ranges, so a recruiter sees the local cell inside its broader market in one request.
Flag thin evidence — surface the roles/places where the data is projected, not observed, so analysts know where to widen ranges or seek a second source.
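
As an illustration of the offer sanity-check, the sketch below places an hourly offer within the returned band. PercentilesHourly mirrors the percentilesHourly field from the capture in section 1, where p90 is null; placeOffer is a hypothetical helper that skips a null cut point rather than guessing a value for it.

```typescript
interface PercentilesHourly {
  p10: number | null;
  p25: number | null;
  p75: number | null;
  p90: number | null;
}

// Hypothetical helper: reports where an offer falls among the known cut points.
function placeOffer(offerHourly: number, medianHourly: number, p: PercentilesHourly): string {
  const cuts: Array<[string, number | null]> = [
    ["p10", p.p10],
    ["p25", p.p25],
    ["median", medianHourly],
    ["p75", p.p75],
    ["p90", p.p90],
  ];
  let lastKnown = "";
  for (const [label, value] of cuts) {
    if (value === null) continue; // missing cut point: skip it
    if (offerHourly < value) {
      return lastKnown === "" ? `below ${label}` : `between ${lastKnown} and ${label}`;
    }
    lastKnown = label;
  }
  // The median is always present, so lastKnown is never empty here.
  return `at or above ${lastKnown}`;
}

// Using the national Marketing Managers band from section 1:
const band: PercentilesHourly = { p10: 39.38, p25: 53.47, p75: 101.48, p90: null };
const placement = placeOffer(60, 77.42, band); // "between p25 and median"
```

Because p90 is null in that capture, an offer of $110/hr is reported as "at or above p75" instead of being placed against a cut point the response does not provide.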

Visual — (TBD — a rendered metro→state→national nested-range chart for one role).

5. How it fits in the toolbox

Data flow:

Consumes — BLS OEWS 2024 (national, state, metro hourly percentiles) and an observed all-occupation cost-of-labor index per area. Jobs enter as SOC codes, which is how job-family-agent / JobFrame address occupations, so a JobFrame family resolves to a SOC and then to a market wage.
Emits — a BenchmarkEstimate contract (median + percentiles + CI + basis/confidence/observed flags + sources). Consumers vendor src/spokes/wage-benchmark/contracts/types.ts.
Feeds — the public JobBrief / job profiles (observed local pay on a role page) and compensation surfaces that need a market anchor.
Sibling — wage-compliance answers "what's the legal minimum here"; Wage Benchmark answers "what's the market paying here." Same SOC × geography spine, different question.

Visual — Tier B (typographic data-flow). SOC / JobFrame family → Wage Benchmark → { JobBrief profile pay · compensation anchor }, with BLS OEWS + cost-of-labor index as the upstream sources.

6. Commercialization / packaging

Wage Benchmark is a service component, not a standalone product — it is one of the market-data primitives a compensation offering composes. It sits behind buyer-facing surfaces (job profiles, pay tooling) rather than being sold on its own.

Data-license posture: the wage numbers derive from public BLS OEWS data (U.S. government, freely usable), which is what lets the observed figures be shown openly. Vendor-survey data (with its licensing constraints) is a separate concern handled elsewhere in the comp stack.
Anything about pricing tiers or packaged offerings is (TBD) — not earned yet, so not stated.

Visual — (TBD — product-tier placement diagram).

7. The vision

A market-wage answer for any job in any place, where the error bars are always honest and always shrinking: the widest are tackled first, and each one narrows wherever real observation can replace projection.

The roadmap deliberately works through the widest error bars first: the projected cells (where today the answer is a cost-of-labor projection) form the queue, and each release converts more of them to direct observation or a better-modeled estimate. Two items have shipped: R5 (scope-aware nested ranges) and R7 (the observed geographic-drivers endpoint, which exposes why a place is more expensive: BLS index, BEA regional price parity, population, density). R6, estimated edges via price-parity extrapolation, is the next modeled link; it is deliberately held until the cost-of-living↔labor-cost relationship is modeled rather than assumed to be 1:1.

Visual — (TBD — the widening roadmap as an observed-vs-projected coverage map).

8. Current status

Grounded in the real code state (contract 0.3.0, src/spokes/wage-benchmark/):

Shipped: national observed anchor (1,465 SOCs, BLS OEWS 2024); observed sub-national for 37 SOCs across 1,820 state and 11,670 metro cells; cost-of-labor index for 444 areas; the observed-first basis ladder with honesty flags; scope-aware nested ranges (R5); the SOC-independent geographic-drivers endpoint (R7). Live routes: GET /benchmark, GET /drivers, GET /coverage, GET /health. MCP tools registered.
In flight / planned: R6 estimated edges via price-parity extrapolation; broader observed sub-national coverage beyond 37 SOCs (the widening queue); a fuller market-range UI on the consuming surfaces.

Visual — Tier A (live capture). GET /api/spokes/wage-benchmark/coverage and /health report the real shipped coverage at request time.

The worked-example figures are real output of the spoke's estimateHourlyBenchmark core against the in-repo BLS OEWS 2024 seed: Marketing Managers (SOC 11-2021) at $77.42/hr nationally, observed; the same role in Arizona at $65.35/hr, observed (wrapped in its national container); and, for a role with no local observation, Top Executives (SOC 11-1000) in Mississippi at $39.84/hr, explicitly projected at 0.79× the national anchor. No figure here is invented.