CLM-L032 — Tagger architecture (three-signal fusion + OOD gates)
Status: 🔒 Locked (legacy) · 🔍 Practitioner-grounded · Falsifiable ✓ — operational in diagnostics/engine/careers/tagger/; not yet integrated into THEORY-OF-TRAITS.md
Topic: 09-tagging-and-clusters
CLAIM TEXT
The framework's career tagger does not score from a single signal; it fuses three signals, gates with out-of-distribution checks, and emits structured evidence with every score. The three-signal fusion:
```
final_score = round(
    cluster_prior   × 0.20 +
    embedding_score × 0.50 +
    rule_score      × 0.30
)
# clamped to [1, 10]
```
The components:
- Cluster prior (weight 0.20) — the mean MI/MN profile of the career's cluster (when known) from the 649-career gold set. Falls back to corpus mean if cluster is unknown. Role: prevents wild outliers when the local signal is sparse.
- Embedding similarity (weight 0.50) — embed the career description with sentence-transformers (all-mpnet-base-v2), find the k=10 nearest neighbors among the 649 gold careers, and take the similarity-weighted average of their scores. Role: captures semantic similarity that surface vocabulary misses.
- Rule adjustments (weight 0.30) — the linguistic-frequency rubric (CLM-L031) applied as bounded deltas: each HIGH-predictive term adds up to (ratio / max_ratio) × 2 points; each LOW-predictive term subtracts the same. Per-dimension delta capped at ±3 by governance. Role: injects auditable, term-level evidence into the score.
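The three components above can be sketched as follows. This is an illustrative reconstruction from the claim text, not the actual engine.py API: the weights, the ±3 cap, and the [1, 10] clamp come from the claim; every function and variable name is hypothetical.

```python
def fuse_score(cluster_prior, embedding_score, rule_score):
    """Weighted three-signal fusion, rounded and clamped to the 1-10 scale.
    Weights (0.20 / 0.50 / 0.30) are the locked values from the claim."""
    raw = cluster_prior * 0.20 + embedding_score * 0.50 + rule_score * 0.30
    return max(1, min(10, round(raw)))

def embedding_signal(neighbor_scores, similarities):
    """Similarity-weighted average of the k=10 nearest gold careers' scores."""
    total = sum(similarities)
    return sum(s * w for s, w in zip(neighbor_scores, similarities)) / total

def rule_signal(base_score, term_deltas, cap=3.0):
    """Apply bounded rule deltas; per-dimension net delta capped at +/-3
    (the governance cap named in the claim)."""
    delta = max(-cap, min(cap, sum(term_deltas)))
    return base_score + delta
```

The clamp-after-round ordering here is one plausible reading of "clamped to [1, 10]"; the operational code may order these differently.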
The framework's structural claims about the architecture:
- No single signal is trusted alone. Each component has known failure modes (cluster prior smooths over real differences; embeddings hallucinate similarity for rare careers; rules over-fire on common vocabulary). The fusion is the discipline.
- Out-of-distribution gates are mandatory. Two checks: (a) top-1 neighbor similarity below the 5th percentile of in-sample top-1 similarities; (b) mean-10 neighbor similarity below the 5th percentile of in-sample mean-10 similarities. If either trips, the case is routed to mandatory human review rather than auto-scored.
- Confidence is computed, not asserted. High (≥0.85) requires neighbor-score std < 1.0 and top-1 similarity > 0.7. Medium (0.65 up to the 0.85 high bar) is more permissive. Low (< 0.65) is everything else. Confidence travels with the score.
- Evidence is structured, not narrative. Every output includes: the k=10 neighbors with similarities, the rules that fired with their term-and-ratio, the cluster prior contribution, the OOD flags, and the confidence level. The score is auditable end-to-end.
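The OOD gates and the high-confidence rule described above reduce to a few lines. A hedged sketch, assuming a simple sorted-index percentile; the names (`ood_flags`, `route`, `is_high_confidence`) are illustrative, and the medium/low split applies to a computed confidence value this sketch does not reproduce.

```python
import statistics

def p5(values):
    """In-sample 5th percentile via a sorted index; adequate for a sketch."""
    ordered = sorted(values)
    idx = int(round(0.05 * (len(ordered) - 1)))
    return ordered[idx]

def ood_flags(top1_sim, mean10_sim, insample_top1, insample_mean10):
    """Gate (a): top-1 similarity below the in-sample 5th percentile.
    Gate (b): mean-10 similarity below the in-sample 5th percentile."""
    return {
        "top1_below_p5": top1_sim < p5(insample_top1),
        "mean10_below_p5": mean10_sim < p5(insample_mean10),
    }

def route(flags):
    """Either gate tripping routes the case to mandatory human review."""
    return "human_review" if any(flags.values()) else "auto_score"

def is_high_confidence(neighbor_scores, top1_sim):
    """High band requires neighbor-score std < 1.0 and top-1 sim > 0.7."""
    return statistics.pstdev(neighbor_scores) < 1.0 and top1_sim > 0.7
```

The point of the percentile thresholds is that they are computed from in-sample distributions, not hand-set; re-fitting the gold set re-fits the gates.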
The framework's load-bearing methodological commitment:
> Every diagnostic score is an evidenced score. A score without traceable evidence is a hallucination, not a measurement.
This is the framework's discipline against AI-assisted-but-unauditable scoring — particularly relevant as embeddings replace authored rubrics across helping professions.
LOCATION (pre-adoption)
diagnostics/engine/careers/tagger/README.md (architecture documentation)
diagnostics/engine/careers/tagger/engine.py (operationalization)
diagnostics/engine/careers/tagger/evaluate.py (validation harness)
diagnostics/engine/careers/CAREER-TAGGER-OPS.md (operational runbook)
LOCATION (post-adoption, when integrated)
Not yet integrated into THEORY-OF-TRAITS.md. Recommended cherry-pick: a Tagging & Methodology sub-section paired with CLM-L031 (frequency rubric) and CLM-L033 (cluster lineage), naming the three-signal fusion, the OOD gate, and the structured-evidence requirement.
EVIDENCE TYPES
[P] Phenomenological
Strong practitioner observation. Tagger outputs with structured evidence are reviewable in seconds (a glance at the neighbors and the rules that fired permits a sanity check); outputs without evidence are not reviewable at all. Practitioners using the tagger report more trust in outputs that carry low-confidence flags than in confident single-number outputs from black-box scoring.
[E] Empirical
- The 649-career gold set provides both training data and validation substrate.
- OOD percentile thresholds are computed from in-sample distributions, not hand-set.
- MISSING — held-out cross-validation accuracy by dimension and confidence band.
- MISSING — agreement study between tagger output and independent practitioner scoring on a held-out career sample.
- MISSING — error analysis on OOD-flagged cases to confirm gates trip on genuinely novel inputs.
[T] Theoretical
- Compatible with CLM-L031 (frequency rubric): rule_score is the rubric operationalized.
- Compatible with CLM-L025 (combinatorial profile space): per-dimension scoring preserves multiplicative structure rather than collapsing to type prediction.
- Compatible with structural-attribution canon (CLM-L020 personalization error): the framework's commitment that diagnostic outputs cite their evidence is the same discipline against unfounded attribution applied to the tooling layer.
- Convergent with ensemble methods, retrieval-augmented generation, and explainable-AI traditions.
[C] Convergent
- Ensemble methods (Dietterich 2000) — weighted combination of weak learners outperforms any single learner; structural parallel.
- Retrieval-augmented generation (Lewis et al. 2020) — embedding-similarity retrieval with cited sources; convergent on the structured-evidence claim.
- Out-of-distribution detection literature (Hendrycks & Gimpel 2017) — explicit OOD gates; structural parallel.
- Explainable AI (LIME, SHAP) — local feature attributions for opaque scores; convergent on the auditability claim.
- MISSING — convergent rs- entries on ensemble methods, RAG, OOD detection, and XAI literatures.
UPSTREAM SOURCES
diagnostics/engine/careers/tagger/README.md
diagnostics/engine/careers/CAREER-TAGGER-OPS.md
POSITIONING IN LITERATURE
- Confirms: ensemble methods, RAG architectures, OOD detection literature, explainable-AI commitments.
- Extends: applies the ensemble + OOD + structured-evidence stack to a psychometric-adjacent scoring task (19-dimensional human capacity/engagement profile), where the tradition has historically defaulted to single-instrument authored rubrics or black-box ML. The framework's contribution: a fusion architecture treating tagging as evidenced inference, not psychometric measurement.
- Departs: from psychometric traditions where a single instrument's score is taken as the measurement; the framework's view is that single-instrument scores are unauditable in principle and must be replaced by evidenced-inference architectures whenever the cost of error is high.
FALSIFIABILITY
The three-signal fusion claim would be falsified if:
- Any single signal (cluster prior alone, embedding alone, rules alone) outperforms the fusion on held-out careers.
- The OOD gates fail to identify genuinely novel inputs in error analysis (i.e., gates either trip on in-distribution cases or fail to trip on OOD cases at high rates).
- Confidence-band predictions are uncalibrated (high-confidence cases are wrong as often as low-confidence cases).
- Practitioners shown structured evidence vs. score-only outputs make equivalent override decisions (i.e., evidence adds no operational value).
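The calibration falsifier above is directly checkable: if high-confidence cases are wrong about as often as low-confidence cases, the bands add nothing. A minimal sketch, assuming records of (band, predicted, gold) triples; the record shape and error tolerance are hypothetical, not the evaluate.py format.

```python
from collections import defaultdict

def error_rate_by_band(records, tolerance=1):
    """records: iterable of (confidence_band, predicted, gold) triples.
    A prediction counts as an error when it misses gold by more than
    `tolerance` points on the 1-10 scale."""
    errors, totals = defaultdict(int), defaultdict(int)
    for band, pred, gold in records:
        totals[band] += 1
        if abs(pred - gold) > tolerance:
            errors[band] += 1
    return {band: errors[band] / totals[band] for band in totals}
```

Calibration holds when the high band's error rate is materially below the low band's; equality across bands would falsify the computed-confidence claim.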
EDGE CASES / KNOWN LIMITS
- Cluster prior is absent for novel clusters. Falls back to corpus mean, which sacrifices specificity. The framework treats this as accepted noise rather than failure.
- Embedding model is fixed. The all-mpnet-base-v2 choice is a tractability decision; newer embeddings may shift behavior. The framework's claim is architectural (three signals + gates + evidence), not model-specific.
- Weight choice (0.2/0.5/0.3) is empirical. The split was tuned against the 649-career gold set; other corpora may require re-tuning. Locked weights are a starting point, not a constant.
- Bounded deltas (±3 per dimension) prevent runaway rule-firing but also prevent the rule layer from overriding clearly wrong embedding scores. The cap is a governance choice, not a derived value.
DISCONFIRMING CASES TRACKED
- High-confidence outputs that practitioners override are logged in the review queue and feed back into evaluation. Persistent overrides on a dimension flag a calibration problem in that dimension's rules or the gold set's coverage.
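The feedback loop above (persistent overrides on a dimension flag a calibration problem) can be sketched as a simple count over the review-queue log. The log shape and the override threshold are hypothetical, not the operational runbook's format.

```python
from collections import Counter

def flag_dimensions(override_log, min_overrides=5):
    """override_log: iterable of (dimension, was_high_confidence) pairs,
    one per practitioner override. Flags dimensions whose high-confidence
    outputs are overridden at least `min_overrides` times."""
    counts = Counter(dim for dim, high_conf in override_log if high_conf)
    return sorted(dim for dim, n in counts.items() if n >= min_overrides)
```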
REFLEXIVITY NOTE
The architecture reflects the originator's commitment to evidenced inference over authoritative pronouncement (CLM-L020's personalization error generalized to tooling). A practitioner expecting "the score" as the deliverable may experience the structured evidence as overhead; the framework's view is that the evidence is the deliverable and the score is a summary of it.
RELATIONSHIP TO CURRENT CANON
- Already integrated? No. THEORY-OF-TRAITS.md does not yet describe the tagger architecture.
- Contradicts current canon? No.
- Net-new? The three-signal fusion with explicit weights, the OOD gating discipline, the computed-confidence rule, and the structured-evidence requirement are net-new to master canon.
- Recommended action: Cherry-pick a Tagging & Methodology sub-section into THEORY-OF-TRAITS.md naming the architecture and the evidenced-inference commitment. Pair with CLM-L031 and CLM-L033.
RESEARCH-BANK GAPS FLAGGED
For BACKLOG.md:
- Dietterich (2000) — Ensemble Methods in Machine Learning.
- Lewis et al. (2020) — Retrieval-Augmented Generation.
- Hendrycks & Gimpel (2017) — OOD detection baselines.
- Ribeiro et al. (2016) — LIME; explainable-AI feature attribution.
- Lundberg & Lee (2017) — SHAP values.
NOTES
- This claim is the framework's clearest statement of methodological discipline at the tooling layer. Worth elevating in any practitioner-facing or developer-facing documentation.
- Pairs with CLM-L031 (the what of the rubric the rule layer implements) and CLM-L033 (the upstream gold layer the cluster prior reads from).