Diagnostic & Practice
The diagnostic and practice subsystems form the student-facing measurement loop. A diagnostic session collects graded responses and feeds them to the engine, which produces a calibration-pinned score (θ) and standard error. That output drives the Skill-Status Engine and, eventually, Student Profiles and Intervention Bundles. Practice then runs an adaptive loop seeded from the completed diagnostic.
Both subsystems live in modules/session. The engine lives in modules/engine.
The diagnostic session
Lifecycle
A diagnostic session (assessment_session) is the mutable container that tracks a student’s
sitting. There is at most one open session per student per assessment cycle; attempting to start
a second returns 409 with the existing session id.
The lifecycle flows through five states:
started → in_progress → finished
↓
time_capped (auto-finish at the 15-minute active-time cap)

Status history is append-only. The session header itself is mutable (it must update as responses
arrive), but every status transition writes an assessment_session_status_history row. The audit
trail is never deleted or edited.
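The split between a mutable header and an append-only history can be sketched as below. This is a minimal illustration, not the module's actual code: the `SessionHeader` and `StatusHistoryRow` shapes, their field names, and the `transition` helper are all assumptions, and the sketch does not validate which transitions are legal.

```typescript
// Hypothetical shapes; the real tables certainly carry more columns.
interface SessionHeader {
  id: string;
  status: string;
}

interface StatusHistoryRow {
  sessionId: string;
  fromStatus: string;
  toStatus: string;
  changedAt: string; // ISO-8601 timestamp
}

// Returns an updated header plus a NEW history array with one row appended.
// Existing history rows are never modified or removed.
function transition(
  header: SessionHeader,
  history: readonly StatusHistoryRow[],
  toStatus: string,
  changedAt: Date,
): { header: SessionHeader; history: StatusHistoryRow[] } {
  const row: StatusHistoryRow = {
    sessionId: header.id,
    fromStatus: header.status,
    toStatus,
    changedAt: changedAt.toISOString(),
  };
  return {
    header: { ...header, status: toStatus },
    history: [...history, row],
  };
}
```

Each call yields exactly one additional history row, so the length of the history equals the number of transitions the session has gone through.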
Starting a session
curl -X POST https://localhost:3000/api/diagnostic-sessions/start \
  -H "Authorization: Bearer <accessToken>" \
  -H "Content-Type: application/json" \
  -d '{ "subSkillId": "PHON-01", "assessmentWindowId": "BOY" }'

Requires USE_DIAGNOSTIC_ENGINE. The server places an advisory lock per (org, studentProfileId)
so concurrent start requests for the same student are serialized rather than racing to a DB
constraint.
Submitting responses
POST /api/diagnostic-sessions/:id/responses accepts a selectedOption only; the client never
supplies isCorrect or a prior score. The server reads the correct answer from the item bank and
grades it internally. This is a fundamental integrity guarantee: a forged response payload cannot
inflate a score or steer θ.
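A minimal sketch of server-side grading follows. It assumes the server already knows which item is currently being served (the hypothetical `currentItemId` argument) and that the item bank exposes a `correctOption` per item; both names, and the `gradeResponse` helper itself, are illustrative rather than taken from the codebase.

```typescript
// Hypothetical item-bank entry: only the answer key matters for grading.
interface ItemKey {
  itemId: string;
  correctOption: string;
}

interface GradedResponse {
  itemId: string;
  selectedOption: string;
  isCorrect: boolean;
}

// Grades a response using only the server's own answer key.
// Anything else the client sent (isCorrect, a prior score) is never read.
function gradeResponse(
  currentItemId: string,
  body: { selectedOption: string },
  itemBank: ReadonlyMap<string, ItemKey>,
): GradedResponse {
  const key = itemBank.get(currentItemId);
  if (key === undefined) {
    throw new Error(`Unknown item: ${currentItemId}`);
  }
  return {
    itemId: currentItemId,
    selectedOption: body.selectedOption,
    isCorrect: body.selectedOption === key.correctOption,
  };
}
```

Because the function reads only `selectedOption` from the request body, a payload that carries an extra `isCorrect: true` field grades exactly the same as one without it.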
The server scores one sub-skill at a time. For multi-sub-skill sittings the client re-calls
/responses with the next subSkillId after the prior sub-skill closes.
The time cap
Active time (total elapsed minus pauses and audio replays) is tracked in cap.ts. At or above
900,000 ms (15 minutes) the next /responses call auto-finishes the session with
sessionEndReason: time_cap. A student is never penalized for pausing or re-listening; only
active time counts toward the cap.
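The arithmetic behind the cap can be sketched as follows. The `TimingSnapshot` shape and function names are assumptions made for illustration; they are not the contents of cap.ts.

```typescript
// 15 minutes of active time, in milliseconds.
const ACTIVE_TIME_CAP_MS = 900_000;

// Hypothetical timing counters tracked for one sitting.
interface TimingSnapshot {
  elapsedMs: number;     // wall-clock time since the sitting started
  pausedMs: number;      // total time spent paused
  audioReplayMs: number; // total time spent re-listening to audio
}

// Active time = total elapsed minus pauses and audio replays, never negative.
function activeTimeMs(t: TimingSnapshot): number {
  return Math.max(0, t.elapsedMs - t.pausedMs - t.audioReplayMs);
}

// True once active time is at or above the cap.
function isTimeCapped(t: TimingSnapshot): boolean {
  return activeTimeMs(t) >= ACTIVE_TIME_CAP_MS;
}
```

For example, a sitting with 1,000,000 ms elapsed, 60,000 ms paused, and 50,000 ms of audio replay has 890,000 ms of active time and is still under the cap.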
Grade-1 audio-first
For Grade-1 students, the server reads student.gradeLevel to select audio items. It never reads
class.grade inside the session module, enforced by a CI lint (lint:session-grade-filter).
If no audio item is available for the sub-skill, firstItemId is null; the session does not
fall back to a text item.
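The no-fallback rule can be illustrated with the sketch below. The `modality` field, the numeric `gradeLevel`, and the behavior for students outside Grade 1 (take the first candidate) are assumptions; only the Grade-1 branch reflects the rule described above.

```typescript
// Hypothetical item shape with a modality flag.
interface FirstItemCandidate {
  itemId: string;
  subSkillId: string;
  modality: "audio" | "text";
}

// Picks the first item for a sitting. For Grade-1 students only audio items
// qualify; if none exists the result is null rather than a text item.
function pickFirstItemId(
  student: { gradeLevel: number },
  subSkillId: string,
  items: readonly FirstItemCandidate[],
): string | null {
  const candidates = items.filter((i) => i.subSkillId === subSkillId);
  if (student.gradeLevel === 1) {
    const audio = candidates.find((i) => i.modality === "audio");
    return audio ? audio.itemId : null; // deliberately no text fallback
  }
  // Assumption: other grades simply take the first available candidate.
  return candidates[0]?.itemId ?? null;
}
```

Note that the function takes the grade from the student record it is handed, mirroring the rule that the session module never consults class.grade.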
Browser-cache resilience
Every student-facing session exposes a state/resume round-trip (B.5 mandatory rule). After
POST /api/diagnostic-sessions/:id/resume the server revalidates ownership, tenant, and the
24-hour window from the DB’s lastTouchedAt, not the token alone. A stolen token therefore cannot
be reused against a different session.
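The revalidation step might look like the sketch below. It assumes the token's signature has already been verified, and the claim names, the `SessionRow` shape, the reason strings, and the treatment of the window boundary (exactly 24 hours still counts as valid) are all illustrative assumptions.

```typescript
const RESUME_WINDOW_MS = 24 * 60 * 60 * 1000;

// Hypothetical claims carried inside the signed state token.
interface StateTokenClaims {
  sessionId: string;
  studentProfileId: string;
  orgId: string;
}

// Hypothetical subset of the session row as read from the database.
interface SessionRow {
  id: string;
  studentProfileId: string;
  orgId: string;
  lastTouchedAt: Date;
}

type ResumeCheck = { ok: true } | { ok: false; reason: string };

// Checks the token's claims against the database row, not against the token alone.
function validateResume(
  requestedSessionId: string,
  claims: StateTokenClaims,
  row: SessionRow,
  now: Date,
): ResumeCheck {
  if (claims.sessionId !== requestedSessionId || row.id !== requestedSessionId) {
    return { ok: false, reason: "wrong_session" };
  }
  if (claims.studentProfileId !== row.studentProfileId) {
    return { ok: false, reason: "not_owner" };
  }
  if (claims.orgId !== row.orgId) {
    return { ok: false, reason: "wrong_tenant" };
  }
  // The window is measured from the DB's lastTouchedAt, never from a token timestamp.
  if (now.getTime() - row.lastTouchedAt.getTime() > RESUME_WINDOW_MS) {
    return { ok: false, reason: "expired" };
  }
  return { ok: true };
}
```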
The GET /api/diagnostic-sessions/:id/state endpoint returns an HMAC-signed state token the
front-end should store in localStorage or sessionStorage. On refresh, tab-switch, or
device-lock the client presents the token to /resume to restore the sitting at its exact step.
The finish endpoint returns θ and standard error on a student-scoped token. This is the
only surface where θ is visible on a student-authenticated response; it is teacher-decision
input. Every other student response object is stripped of θ, standard error, data sufficiency,
skill status, and profile/bundle/scaffold information by the allow-list serializer.
The 6 diagnostic routes
| Method | Path | Auth |
|---|---|---|
| POST | /api/diagnostic-sessions/start | USE_DIAGNOSTIC_ENGINE + own-student |
| GET | /api/diagnostic-sessions/:id/state | USE_DIAGNOSTIC_ENGINE + own-student |
| POST | /api/diagnostic-sessions/:id/resume | USE_DIAGNOSTIC_ENGINE + own-student |
| POST | /api/diagnostic-sessions/:id/responses | USE_DIAGNOSTIC_ENGINE + own-student |
| POST | /api/diagnostic-sessions/:id/finish | USE_DIAGNOSTIC_ENGINE + own-student |
| GET | /api/diagnostic-sessions/:id | USE_DIAGNOSTIC_ENGINE + own-student OR ClassTeacher read |
The engine (θ scoring)
The engine module (modules/engine) is the measurement loop. It accepts a batch of graded
responses, applies the calibration-pinned scoring formula, writes an immutable
diagnostic_session audit row, and returns the new θ, standard error, and the next item to serve.
No LLM in the measurement loop
No language model is ever involved in θ computation, item selection, or the scoring of a response.
The formula is deterministic and sealed: the same inputs always produce byte-identical θ to 4
decimal places (V-6 replay guarantee). θ is stored as Decimal(7,4) rather than a float to
eliminate cross-machine drift.
Two-track δ selection
Each item has a deltaPrior (computed at import time) and, eventually, a deltaCalibrated value
(written by WP-CAL). The engine uses calibrated δ once a calibration trust threshold of 300
comparable responses is met; below that it falls back to the prior. The snapshot of resolved δ
values is frozen on the audit row so that a V-6 replay reads exactly what the engine used, not
the live item_bank value that a later calibration run might have overwritten.
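The two-track rule can be sketched as a small resolver plus a frozen snapshot. The `comparableResponses` counter, the `track` label, and the helper names are assumptions introduced for illustration; `deltaPrior`, `deltaCalibrated`, and the threshold of 300 come from the description above.

```typescript
const CALIBRATION_TRUST_THRESHOLD = 300;

// Hypothetical view of one item's difficulty fields.
interface ItemDeltas {
  itemId: string;
  deltaPrior: number;
  deltaCalibrated: number | null; // null until calibration has written a value
  comparableResponses: number;    // assumed counter behind the trust threshold
}

interface ResolvedDelta {
  itemId: string;
  delta: number;
  track: "calibrated" | "prior";
}

// Uses the calibrated δ once the trust threshold is met; otherwise the prior.
function resolveDelta(item: ItemDeltas): ResolvedDelta {
  const trusted =
    item.deltaCalibrated !== null &&
    item.comparableResponses >= CALIBRATION_TRUST_THRESHOLD;
  return trusted
    ? { itemId: item.itemId, delta: item.deltaCalibrated as number, track: "calibrated" }
    : { itemId: item.itemId, delta: item.deltaPrior, track: "prior" };
}

// Builds the snapshot stored on the audit row. Freezing it models the idea that
// a replay reads these values, not whatever the item bank holds later.
function snapshotDeltas(items: readonly ItemDeltas[]): readonly ResolvedDelta[] {
  return Object.freeze(items.map((item) => Object.freeze(resolveDelta(item))));
}
```

An item with a calibrated value but only 299 comparable responses still resolves to its prior; at 300 it switches to the calibrated track.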
Item selection (CAT)
The engine selects the next item by maximum Fisher information at the student’s current θ ± 0.5
logits. Ties are broken lexicographically by itemId for determinism. Already-served items within
the sitting are excluded.
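A rough sketch of the selection step is shown below. It makes two assumptions that the text does not confirm: that items follow a Rasch (1PL) model, so information is p(1 − p), and that "θ ± 0.5 logits" means candidates are restricted to items whose δ lies within 0.5 logits of the current θ. The function and type names are illustrative.

```typescript
interface CandidateItem {
  itemId: string;
  delta: number; // resolved difficulty in logits
}

// Fisher information of a Rasch item at ability theta: I = p * (1 - p).
function fisherInformation(theta: number, delta: number): number {
  const p = 1 / (1 + Math.exp(-(theta - delta)));
  return p * (1 - p);
}

// Picks the unserved item inside the window with maximum information.
// Exact ties are broken by the lexicographically smaller itemId.
function selectNextItem(
  theta: number,
  items: readonly CandidateItem[],
  served: ReadonlySet<string>,
  windowLogits = 0.5,
): string | null {
  let best: CandidateItem | null = null;
  let bestInfo = -Infinity;
  for (const item of items) {
    if (served.has(item.itemId)) continue;
    if (Math.abs(item.delta - theta) > windowLogits) continue;
    const info = fisherInformation(theta, item.delta);
    const isTieWin = info === bestInfo && best !== null && item.itemId < best.itemId;
    if (info > bestInfo || isTieWin) {
      best = item;
      bestInfo = info;
    }
  }
  return best ? best.itemId : null;
}
```

Under the Rasch assumption, information peaks where δ equals θ, so the selector favors items whose difficulty is closest to the student's current estimate. The lexicographic tie-break makes the result independent of the order in which items are listed.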
The 4 engine routes
| Method | Path | Auth |
|---|---|---|
| POST | /api/engine/score | USE_DIAGNOSTIC_ENGINE + own-student (for student persona) |
| GET | /api/engine/sessions/{id} | USE_DIAGNOSTIC_ENGINE + org scope |
| GET | /api/engine/sessions/by-student/{studentId} | USE_DIAGNOSTIC_ENGINE + own-student |
| POST | /api/engine/replay/{sessionId} | RUN_BACKOFFICE |
The replay route re-runs a session’s pinned inputs without writing anything. It is the V-6 audit path.
Output feeds the skill-status engine
When a diagnostic finishes, the server writes two things: the per-sub-skill θ rows
(student_assessment_result) and a count-based screening_score evidence row derived from the
server-graded responses. The screening_score row is what the Measurement Bridge
writes into the SSE’s evidence table and what the SSE recompute consumes on its next run. The
SSE reads that evidence alongside CBM probes and other evidence types to produce a SkillStatus
and DomainStatus for each sub-skill and domain. Those statuses then drive profile resolution and,
ultimately, an intervention bundle recommendation.
Adaptive practice (WP-05)
Practice is seeded from a completed diagnostic and runs a per-sub-skill adaptive loop.
Seeding a practice queue
POST /api/practice/queues/seed/:diagnosticSessionId builds an ordered practice_queue from
the completed diagnostic’s θ rows. Only sub-skills marked weak or monitor are included; sub-skills
with null θ (insufficient data) are excluded per GSR-01. The seed also initializes a durable
student_practice_band per sub-skill, seeded from the diagnostic θ. If a band already exists for
that sub-skill it is left unchanged (carry-over rule).
In Wave 1 each queue holds exactly one sub-skill (the weakest-eligible by θ). When a student has several weak sub-skills they re-seed a new queue once the current one is exhausted.
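The eligibility and weakest-first rules can be sketched as follows. The `ThetaRow` shape, the `flag` field, the "secure" value, and the tie-break by subSkillId are assumptions; "weakest" is taken to mean the lowest θ.

```typescript
// Hypothetical per-sub-skill result row from a completed diagnostic.
interface ThetaRow {
  subSkillId: string;
  theta: number | null;                 // null means insufficient data
  flag: "weak" | "monitor" | "secure";  // "secure" is an assumed third value
}

// Returns the single sub-skill a Wave 1 queue would hold, or null if none is eligible.
function pickSeedSubSkill(rows: readonly ThetaRow[]): string | null {
  let best: { subSkillId: string; theta: number } | null = null;
  for (const row of rows) {
    if (row.theta === null) continue; // excluded: insufficient data
    if (row.flag !== "weak" && row.flag !== "monitor") continue;
    const isLower = best === null || row.theta < best.theta;
    const isTieWin = best !== null && row.theta === best.theta && row.subSkillId < best.subSkillId;
    if (isLower || isTieWin) {
      best = { subSkillId: row.subSkillId, theta: row.theta };
    }
  }
  return best ? best.subSkillId : null;
}

// Carry-over rule: an existing band is kept; only missing bands are initialized.
function seedBand(
  bands: ReadonlyMap<string, number>,
  subSkillId: string,
  bandFromTheta: number,
): Map<string, number> {
  const next = new Map(bands);
  if (!next.has(subSkillId)) next.set(subSkillId, bandFromTheta);
  return next;
}
```

How a band value is derived from θ is not described here, so `bandFromTheta` is passed in as an already-computed number.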
CAT item selection in practice
GET /api/practice/queues/current/next-item applies the same max Fisher information at θ ± 0.5
logits strategy as the diagnostic engine. Items already served in the current queue are excluded.
The active queue is derived from the authenticated user’s own profile; there is no :studentId
path parameter, which prevents IDOR.
Submitting and band routing
POST /api/practice/queues/current/submit-response grades the response, updates θ, and appends a
practice_response row. When a block of at least 8 items is complete the 80/60 routing rule runs
on the durable band:
| Block accuracy | Action |
|---|---|
| ≥ 80% | Advance band (step up) |
| ≥ 60% and < 80% | Hold band |
| < 60% | Regress band (step down) |
Band mutations write to student_practice_band_history. The band persists across re-seeded queues;
a student who returns after a week starts from where they left off.
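The routing table above can be expressed as a small pure function. This is a sketch under the assumption that the rule is evaluated on a count of correct answers out of the block size; the names are illustrative. Integer comparisons are used so that boundary cases such as exactly 80% are not affected by floating-point rounding.

```typescript
type BandAction = "advance" | "hold" | "regress";

const MIN_BLOCK_SIZE = 8;

// Applies the 80/60 rule to one completed block.
// Returns null while the block is still too short to route.
function routeBand(correct: number, total: number): BandAction | null {
  if (total < MIN_BLOCK_SIZE) return null;
  // Compare correct/total against 80% and 60% without dividing.
  if (correct * 100 >= 80 * total) return "advance";
  if (correct * 100 >= 60 * total) return "hold";
  return "regress";
}
```

With a block of 8 items, 7 correct (87.5%) advances the band, 5 or 6 correct (62.5% or 75%) holds it, and 4 or fewer (50% or below) regresses it.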
State restore for practice
GET /api/practice/queues/current/state returns the same HMAC-signed token as the diagnostic
state endpoint. The client stores it and presents it on the next session open to restore position
in the queue, with the same 5-disruption-type resilience (refresh, app-switch, lock, WiFi drop,
forced close) required for all student-facing flows.
The 4 practice routes
| Method | Path | Auth |
|---|---|---|
| POST | /api/practice/queues/seed/:diagnosticSessionId | USE_DIAGNOSTIC_ENGINE + own-student |
| GET | /api/practice/queues/current/next-item | USE_DIAGNOSTIC_ENGINE + own-student |
| POST | /api/practice/queues/current/submit-response | USE_DIAGNOSTIC_ENGINE + own-student |
| GET | /api/practice/queues/current/state | USE_DIAGNOSTIC_ENGINE + own-student |
What never reaches the student
The toStudentSessionDTO allow-list serializer strips the following from every student-facing
response:
- θ and standard error
- dataSufficiency
- SkillStatus and DomainStatus
- Primary profile, modifier profile, and sub-flag identifiers
- Bundle and scaffold-tier information
- Difficulty-band internals
The diagnostic finish response and the teacher-facing session summary are the only θ-adjacent
surfaces, and the summary deliberately omits raw θ.
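An allow-list serializer copies only named fields and drops everything else by default. The sketch below illustrates the pattern; the allowed field names are hypothetical and the function is deliberately named differently from the real toStudentSessionDTO, whose field list is not shown here.

```typescript
// Hypothetical allow-list; the real serializer's field list is not documented here.
const STUDENT_ALLOWED_FIELDS = ["sessionId", "status", "currentItemId"] as const;

// Copies only allow-listed fields. Anything not named above (θ, standard error,
// skill status, profile or bundle data, ...) is dropped without being inspected.
function toStudentSessionDTOSketch(
  internal: Readonly<Record<string, unknown>>,
): Record<string, unknown> {
  const dto: Record<string, unknown> = {};
  for (const field of STUDENT_ALLOWED_FIELDS) {
    if (field in internal) {
      dto[field] = internal[field];
    }
  }
  return dto;
}
```

The design choice matters: with a deny-list, a newly added internal field leaks until someone remembers to block it, whereas with an allow-list it stays hidden until someone deliberately exposes it.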
Re-diagnostic
A re-diagnostic is a new sitting in a later assessment window, not a fork of the original machinery. The same orchestrator path runs, but with two additional guards at the start.
14-day gate. POST /api/diagnostic-sessions/start/re-diagnostic checks whether at least 14 days
have elapsed since the most recent completed diagnostic for the same student and cycle
(measured from finishedAt, not startedAt). A request inside the gate returns 409 with
reason: rediag_too_soon and an earliestAllowed timestamp. If no prior completed diagnostic
exists, the gate is waived and the session opens normally.
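The gate logic can be sketched as below. It assumes "14 days" means 14 × 24 hours measured from finishedAt and that a request exactly at the boundary is allowed; the `GateResult` shape and function name are illustrative, while `rediag_too_soon` and `earliestAllowed` come from the description above.

```typescript
const REDIAG_GATE_MS = 14 * 24 * 60 * 60 * 1000;

type GateResult =
  | { allowed: true }
  | { allowed: false; reason: "rediag_too_soon"; earliestAllowed: string };

// lastFinishedAt is the finishedAt of the most recent completed diagnostic
// for the same student and cycle, or null if there is none.
function checkRediagGate(lastFinishedAt: Date | null, now: Date): GateResult {
  if (lastFinishedAt === null) {
    return { allowed: true }; // no prior completed diagnostic: gate waived
  }
  const earliest = new Date(lastFinishedAt.getTime() + REDIAG_GATE_MS);
  if (now.getTime() >= earliest.getTime()) {
    return { allowed: true };
  }
  return {
    allowed: false,
    reason: "rediag_too_soon",
    earliestAllowed: earliest.toISOString(),
  };
}
```

A route handler would map the `allowed: false` branch to the 409 response and pass `reason` and `earliestAllowed` through to the client.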
Prior queue archival. Before the new sitting opens, all open practice queues for the student
are archived (archivedAt set, status changed to abandoned). Archival goes through the
mutateQueue single write path. If the one-active sitting invariant fires first (an open sitting
already exists for the cycle), the archival is skipped entirely to prevent silent data loss
on the student’s practice plan.
The comparison between the prior and new sittings is a read-side concern handled by the reporting module.
Related subsystems
- Skill-Status Engine: consumes the screening_score evidence written when a diagnostic finishes
- Task Delivery: serves the actual item content for each item ID
- Skills Taxonomy: defines the 79 sub-skills diagnostics measure
- Progress Monitoring: tracks practice outcomes over time
- Getting Started: end-to-end developer setup