Wage Benchmark — plain-language explainer
Wage Benchmark tells you what an hour of a given job is worth in a given place — and is honest about how sure it is.
A People Analytics Toolbox component. Built to the portfolio Explainer Standard v1.0. Every claim below is grounded in the spoke's own code and contracts (src/spokes/wage-benchmark/, contract 0.3.0); anything not yet built is marked (TBD).
1. What is it?
Wage Benchmark is a market-pay lookup: give it a job (as a SOC occupation code) and a place (national, a state, or a metro area), and it returns the hourly market wage — a median, a percentile band, and a confidence interval — drawn from U.S. Bureau of Labor Statistics survey data.
What makes it more than a lookup is that every answer carries its own honesty flags: whether the number was directly observed for that exact job-and-place cell, or projected from a broader anchor, and how wide the uncertainty is as a result.
Visual — Tier A (live API capture). A real call to the running spoke:
GET /api/spokes/wage-benchmark/benchmark?soc=11-2021&level=national
→ {
  "title": "Marketing Managers",
  "medianHourly": 77.42,
  "percentilesHourly": { "p10": 39.38, "p25": 53.47, "p75": 101.48, "p90": null },
  "confidenceInterval": { "low": 74.32, "high": 80.52, "level": 0.8 },
  "basis": "oews-national-observed",
  "confidence": "observed",
  "observedForCell": true,
  "sources": [{ "name": "BLS OEWS (national)", "kind": "government", "asOf": "2024" }]
}
(Real output of estimateHourlyBenchmark, BLS OEWS 2024 national file.)
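For readers consuming this response in code, here is a minimal sketch of a type that mirrors the fields in the capture above, plus a one-line summary helper. The names BenchmarkEstimateSketch and summarizeEstimate are hypothetical; the real contract lives in src/spokes/wage-benchmark/contracts/types.ts and may differ in naming, optionality, or extra fields.

```typescript
// Hypothetical mirror of the captured response; not the spoke's actual contract.
interface BenchmarkEstimateSketch {
  title: string;
  medianHourly: number;
  // A percentile can be null, as p90 is in the capture above.
  percentilesHourly: {
    p10: number | null;
    p25: number | null;
    p75: number | null;
    p90: number | null;
  };
  confidenceInterval: { low: number; high: number; level: number };
  basis: string;
  confidence: "observed" | "modeled";
  observedForCell: boolean;
  sources: { name: string; kind: string; asOf: string }[];
  notes?: string[];
}

// Render the median together with its provenance and interval,
// so the number never travels without its honesty flags.
function summarizeEstimate(e: BenchmarkEstimateSketch): string {
  const provenance = e.observedForCell ? "observed" : "projected";
  const levelPct = Math.round(e.confidenceInterval.level * 100);
  const low = e.confidenceInterval.low.toFixed(2);
  const high = e.confidenceInterval.high.toFixed(2);
  return `${e.title}: $${e.medianHourly.toFixed(2)}/hr (${provenance}, ${levelPct}% CI $${low} to $${high})`;
}

// Values copied from the capture above.
const marketingManagers: BenchmarkEstimateSketch = {
  title: "Marketing Managers",
  medianHourly: 77.42,
  percentilesHourly: { p10: 39.38, p25: 53.47, p75: 101.48, p90: null },
  confidenceInterval: { low: 74.32, high: 80.52, level: 0.8 },
  basis: "oews-national-observed",
  confidence: "observed",
  observedForCell: true,
  sources: [{ name: "BLS OEWS (national)", kind: "government", asOf: "2024" }],
};

console.log(summarizeEstimate(marketingManagers));
```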
2. What problem does it solve — and why is it different?
The pain it removes: most market-pay numbers arrive as a single confident figure with no provenance. You can't tell whether "$77/hr for this role in Tucson" is a real local observation or a national average someone quietly stretched to fit.
The difference, stated as a shift:
- FROM one point number, same-looking everywhere, with the uncertainty hidden.
- TO an observed number where the survey actually measured that cell, an explicitly projected number where it didn't, and a band that widens honestly when the evidence is thin.
How it differs from the obvious substitutes:
- vs. doing it by hand in the OEWS spreadsheets — the raw BLS files are 100k+ rows across national, state, metro, and industry cuts with different shapes; Wage Benchmark resolves the right cell, converts annual↔hourly on the BLS standard-hours convention, and falls back deterministically when a cell is missing.
- vs. a generic salary site: those rarely tell you which number is real. Wage Benchmark refuses to launder a projection as an observation; the basis and confidence fields are first-class.
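The annual↔hourly conversion mentioned above can be sketched as follows. BLS OEWS derives annual wages from hourly wages using a year-round, full-time figure of 2,080 hours; the function names and the rounding choices here are assumptions for illustration, not the spoke's actual code.

```typescript
// BLS OEWS "year-round, full-time" hours figure used to relate hourly and annual wages.
const BLS_FULL_TIME_HOURS_PER_YEAR = 2080;

// Convert an hourly wage to an annual wage, rounded to the nearest dollar (assumed rounding).
function hourlyToAnnual(hourly: number): number {
  return Math.round(hourly * BLS_FULL_TIME_HOURS_PER_YEAR);
}

// Convert an annual wage to an hourly wage, rounded to the nearest cent (assumed rounding).
function annualToHourly(annual: number): number {
  return Math.round((annual / BLS_FULL_TIME_HOURS_PER_YEAR) * 100) / 100;
}

// The national Marketing Managers median from section 1, expressed annually.
const marketingManagersAnnual = hourlyToAnnual(77.42);
console.log(marketingManagersAnnual, annualToHourly(marketingManagersAnnual));
```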
Visual — Tier B (FROM→TO typographic block). The shift above is the visual; a rendered comparison block is a follow-up (FU-A).
3. How does it work?
Inputs → method → outputs, concretely:
- Input: a SOC occupation code + a geo selector ({ level: "national" | "state" | "metro", code? }).
- Method: observed-first ladder. The estimator tries the most credible source available and records which rung it landed on:
  - Direct cell observation: BLS measured this exact SOC × place. Tightest band. (oews-metro-observed, oews-state-observed, oews-national-observed.)
  - Cost-of-labor projection: no per-SOC local observation, so it takes the national anchor for that SOC and scales it by the place's observed all-occupation cost-of-labor index. Moderately wide band. (oews-geo-index-projected.)
  - National anchor, widened: only when no area index exists at all. (oews-national-projected.)
- Output: median hourly + p10/p25/p75/p90 + an 80% confidence interval, plus the honesty fields basis, confidence (observed | modeled), observedForCell (boolean), and cited sources.
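The ladder above can be sketched as a small pure function. This is an illustration under stated assumptions, not the estimator's real code: resolveLadder and its input shape are hypothetical, the national anchor is assumed to be observed, and the sketch covers only the median and honesty flags because the band-widening rules are not spelled out here. The input numbers in the usage line are illustrative, not BLS figures.

```typescript
type Basis =
  | "oews-metro-observed"
  | "oews-state-observed"
  | "oews-national-observed"
  | "oews-geo-index-projected"
  | "oews-national-projected";

// Hypothetical inputs: what the estimator would know about one SOC x place cell.
interface LadderInputs {
  level: "national" | "state" | "metro";
  nationalMedian: number;      // national anchor for the SOC (assumed observed)
  observedCellMedian?: number; // direct SOC x place observation, if BLS measured it
  areaIndex?: number;          // all-occupation cost-of-labor index for the place
}

interface LadderResult {
  medianHourly: number;
  basis: Basis;
  confidence: "observed" | "modeled";
  observedForCell: boolean;
}

const roundToCents = (x: number): number => Math.round(x * 100) / 100;

function resolveLadder(input: LadderInputs): LadderResult {
  // Rung 1: a direct observation for the requested cell.
  if (input.level === "national") {
    return { medianHourly: input.nationalMedian, basis: "oews-national-observed", confidence: "observed", observedForCell: true };
  }
  if (input.observedCellMedian !== undefined) {
    const basis: Basis = input.level === "state" ? "oews-state-observed" : "oews-metro-observed";
    return { medianHourly: input.observedCellMedian, basis, confidence: "observed", observedForCell: true };
  }
  // Rung 2: scale the national anchor by the place's cost-of-labor index.
  if (input.areaIndex !== undefined) {
    return { medianHourly: roundToCents(input.nationalMedian * input.areaIndex), basis: "oews-geo-index-projected", confidence: "modeled", observedForCell: false };
  }
  // Rung 3: no area index at all, so fall back to the national anchor.
  return { medianHourly: input.nationalMedian, basis: "oews-national-projected", confidence: "modeled", observedForCell: false };
}

// Illustrative inputs: a state cell with no direct observation and an index of 0.8.
console.log(resolveLadder({ level: "state", nationalMedian: 50, areaIndex: 0.8 }));
```

Returning the rung alongside the number is the point of the design: a caller never has to infer from the value alone whether it was measured or projected.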
Differentiation beat: the practitioner's real question isn't "what's the number" — it's "can I defend this number in a pay conversation?" The basis ladder answers that directly: an observed cell is survey-grade; a modeled cell is labelled as an estimate with its method spelled out in the notes.
Visual — Tier A (live capture of the projection path). When the exact cell isn't observed:
GET /api/spokes/wage-benchmark/benchmark?soc=11-1000&level=state&code=MS
→ medianHourly: 39.84, basis: "oews-geo-index-projected", confidence: "modeled",
notes: ["No direct per-SOC observation for this state cell; projecting the
national anchor by Mississippi's observed all-occupation cost-of-labor
index (0.79×)."]
(Real output — the spoke tells you, in the response, that this is a projection and why.)
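A caller might build these requests and surface the projection note like this. The route and the soc, level, and code query parameters come from the captures above; the helper names buildBenchmarkPath and provenanceLine, and the treatment of notes as optional, are assumptions for illustration.

```typescript
// Geo selector shape as described in section 3; code is omitted for national requests.
interface GeoSelector {
  level: "national" | "state" | "metro";
  code?: string;
}

// Build the request path for the benchmark route shown in the captures.
function buildBenchmarkPath(soc: string, geo: GeoSelector): string {
  const params = [
    `soc=${encodeURIComponent(soc)}`,
    `level=${encodeURIComponent(geo.level)}`,
  ];
  if (geo.code !== undefined) {
    params.push(`code=${encodeURIComponent(geo.code)}`);
  }
  return `/api/spokes/wage-benchmark/benchmark?${params.join("&")}`;
}

// Only the response fields this helper reads (hypothetical partial shape).
interface ProvenanceFields {
  basis: string;
  confidence: "observed" | "modeled";
  notes?: string[];
}

// For a modeled answer, return the spoke's own explanation; otherwise just the basis.
function provenanceLine(r: ProvenanceFields): string {
  if (r.confidence === "modeled" && r.notes !== undefined && r.notes.length > 0) {
    return `${r.basis}: ${r.notes[0]}`;
  }
  return r.basis;
}

console.log(buildBenchmarkPath("11-1000", { level: "state", code: "MS" }));
```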
4. What does it enable?
Concrete uses a practitioner would recognize:
- Sanity-check an offer against the local market floor for the role, with a defensible band rather than a single number.
- Geo-differentiate pay — see how the same job prices across metros, and whether the difference is observed or merely cost-of-labor-projected.
- Feed a pay model: supply the market-anchor leg to a fuller compensation model (it is the sibling of wage-compliance: market pricing vs. the legal floor).
- Range-build with nested context: a metro answer arrives already wrapped in its containing state and national ranges, so a recruiter sees the local cell inside its broader market in one request.
- Flag thin evidence — surface the roles/places where the data is projected, not observed, so analysts know where to widen ranges or seek a second source.
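As one illustration of the first use above, an offer can be placed against the returned percentile band. This is a sketch under assumptions: positionOffer and its labels are hypothetical, and using p25 to p75 as the reference band is a choice made here, not something the spoke prescribes. It treats a null percentile as "no band" rather than guessing a value.

```typescript
// Percentile shape as it appears in the captured response; any entry may be null.
interface PercentilesHourly {
  p10: number | null;
  p25: number | null;
  p75: number | null;
  p90: number | null;
}

type OfferPosition = "below-p25" | "within-p25-p75" | "above-p75" | "band-unavailable";

// Place an hourly offer relative to the interquartile band.
function positionOffer(offerHourly: number, p: PercentilesHourly): OfferPosition {
  if (p.p25 === null || p.p75 === null) {
    return "band-unavailable"; // do not invent a band the data does not provide
  }
  if (offerHourly < p.p25) return "below-p25";
  if (offerHourly > p.p75) return "above-p75";
  return "within-p25-p75";
}

// Percentiles copied from the national Marketing Managers capture in section 1.
const marketingManagersBand: PercentilesHourly = { p10: 39.38, p25: 53.47, p75: 101.48, p90: null };
console.log(positionOffer(60, marketingManagersBand));
```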
Visual — (TBD — a rendered metro→state→national nested-range chart for one role).
5. How it fits in the toolbox
Data flow:
- Consumes: BLS OEWS 2024 (national, state, metro hourly percentiles) and an observed all-occupation cost-of-labor index per area. Jobs enter as SOC codes, which is how job-family-agent / JobFrame address occupations, so a JobFrame family resolves to a SOC and then to a market wage.
- Emits: a BenchmarkEstimate contract (median + percentiles + CI + basis/confidence/observed flags + sources). Consumers vendor src/spokes/wage-benchmark/contracts/types.ts.
- Feeds: the public JobBrief / job profiles (observed local pay on a role page) and compensation surfaces that need a market anchor.
- Sibling: wage-compliance answers "what's the legal minimum here"; Wage Benchmark answers "what's the market paying here." Same SOC × geography spine, different question.
Visual — Tier B (typographic data-flow). SOC / JobFrame family → Wage Benchmark → { JobBrief profile pay · compensation anchor }, with BLS OEWS + cost-of-labor index as the upstream sources.
6. Commercialization / packaging
Wage Benchmark is a service component, not a standalone product — it is one of the market-data primitives a compensation offering composes. It sits behind buyer-facing surfaces (job profiles, pay tooling) rather than being sold on its own.
- Data-license posture: the wage numbers derive from public BLS OEWS data (U.S. government, freely usable), which is what lets the observed figures be shown openly. Vendor-survey data (with its licensing constraints) is a separate concern handled elsewhere in the comp stack.
- Anything about pricing tiers or packaged offerings is (TBD) — not earned yet, so not stated.
Visual — (TBD — product-tier placement diagram).
7. The vision
A market-wage answer for any job in any place, where the error bars are always honest and always shrinking — widest first, narrowed wherever real observation can replace projection.
The roadmap is a deliberate widest-error-bars-first widening: the projected cells (where today the answer is a cost-of-labor projection) are the queue, and each release converts more of them to direct observation or a better-modeled estimate. R5 (scope-aware nested ranges) and R7 (the observed geographic-drivers endpoint exposing why a place is more expensive — BLS index, BEA regional price parity, population, density) have shipped. R6 — estimated edges via price-parity extrapolation — is the next modeled link and is deliberately held until the cost-of-living↔labor-cost relationship is modeled rather than assumed 1:1.
Visual — (TBD — the widening roadmap as an observed-vs-projected coverage map).
8. Current status
Grounded in the real code state (contract 0.3.0, src/spokes/wage-benchmark/):
- Shipped: national observed anchor (1,465 SOCs, BLS OEWS 2024); observed sub-national for 37 SOCs across 1,820 state and 11,670 metro cells; cost-of-labor index for 444 areas; the observed-first basis ladder with honesty flags; scope-aware nested ranges (R5); the SOC-independent geographic-drivers endpoint (R7). Live routes: GET /benchmark, GET /drivers, GET /coverage, GET /health. MCP tools registered.
- In flight / planned: R6 estimated edges via price-parity extrapolation; broader observed sub-national coverage beyond 37 SOCs (the widening queue); a fuller market-range UI on the consuming surfaces.
Visual — Tier A (live capture). GET /api/spokes/wage-benchmark/coverage and /health report the real shipped coverage at request time.
The worked example used above is real output of the spoke's estimateHourlyBenchmark core against the in-repo BLS OEWS 2024 seed: Marketing Managers (SOC 11-2021) at $77.42/hr nationally, observed; the same role in Arizona at $65.35/hr, observed (wrapped in its national container); and, for a role with no local observation, Top Executives (11-1000) in Mississippi at $39.84/hr, explicitly projected at 0.79× the national anchor. No figure here is invented.