Diagnostic & Practice

The diagnostic and practice subsystems form the student-facing measurement loop. A diagnostic session collects graded responses and feeds them to the engine, which produces a calibration-pinned score (θ) and a standard error. That output drives the Skill-Status Engine and, eventually, Student Profiles and Intervention Bundles. Practice then runs an adaptive loop seeded from the completed diagnostic.

Both subsystems live in modules/session. The engine lives in modules/engine.

The diagnostic session

Lifecycle

A diagnostic session (assessment_session) is the mutable container that tracks a student’s sitting. There is at most one open session per student per assessment cycle; attempting to start a second returns 409 with the existing session id.

The lifecycle flows through the following states:


started → in_progress → finished
                ↓
          time_capped   (auto-finish at the 15-minute active-time cap)

Status history is append-only. The session header itself is mutable (it must update as responses arrive), but every status transition writes an assessment_session_status_history row. The audit trail is never deleted or edited.
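The transition rules and the append-only history can be sketched as a small in-memory model. This is an illustration rather than the module's code: ALLOWED, transition, and the row shapes are hypothetical names, and the edge set is read off the diagram above, assuming time_capped is reachable only from in_progress.

```typescript
// Hypothetical in-memory model of the session lifecycle shown above.
type SessionStatus = "started" | "in_progress" | "finished" | "time_capped";

interface StatusHistoryRow {
  sessionId: string;
  from: SessionStatus;
  to: SessionStatus;
  at: string; // ISO timestamp
}

interface SessionHeader {
  id: string;
  status: SessionStatus;
}

// Assumed edge set, read off the diagram; the two end states have no exits.
const ALLOWED: Record<SessionStatus, readonly SessionStatus[]> = {
  started: ["in_progress"],
  in_progress: ["finished", "time_capped"],
  finished: [],
  time_capped: [],
};

// Mutates the header and appends (never edits) one history row per transition.
function transition(
  session: SessionHeader,
  to: SessionStatus,
  history: StatusHistoryRow[],
  now: Date = new Date(),
): void {
  if (!ALLOWED[session.status].includes(to)) {
    throw new Error(`illegal transition ${session.status} -> ${to}`);
  }
  history.push({ sessionId: session.id, from: session.status, to, at: now.toISOString() });
  session.status = to;
}
```

Keeping the history as a separate append-only list means the header can change freely while the sequence of transitions remains reconstructible.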

Starting a session


curl -X POST https://localhost:3000/api/diagnostic-sessions/start \
  -H "Authorization: Bearer <accessToken>" \
  -H "Content-Type: application/json" \
  -d '{ "subSkillId": "PHON-01", "assessmentWindowId": "BOY" }'

Requires USE_DIAGNOSTIC_ENGINE. The server places an advisory lock per (org, studentProfileId) so concurrent start requests for the same student are serialized rather than racing to a DB constraint.

Submitting responses

POST /api/diagnostic-sessions/:id/responses accepts a selectedOption only; the client never supplies isCorrect or a prior score. The server reads the correct answer from the item bank and grades it internally. This is a fundamental integrity guarantee: a forged response payload cannot inflate a score or steer θ.
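A minimal sketch of that grading step, assuming the server already knows which item it last served for the session. The type names, the correctOption field, and gradeResponse are hypothetical; only selectedOption comes from the text above.

```typescript
// Hypothetical shapes; the real item bank and response types may differ.
interface ItemBankEntry {
  itemId: string;
  correctOption: string;
}

interface GradedResponse {
  itemId: string;
  selectedOption: string;
  isCorrect: boolean;
}

// Reads only `selectedOption` from the raw body. Any other field the client
// sends (for example a forged `isCorrect` or `theta`) is never looked at.
function gradeResponse(
  rawBody: Record<string, unknown>,
  servedItem: ItemBankEntry,
): GradedResponse {
  const selectedOption = rawBody["selectedOption"];
  if (typeof selectedOption !== "string") {
    throw new Error("selectedOption is required");
  }
  return {
    itemId: servedItem.itemId,
    selectedOption,
    // Correctness comes from the item bank, not from the client.
    isCorrect: selectedOption === servedItem.correctOption,
  };
}
```

Because the function picks one field out of the body instead of spreading it, extra keys in a forged payload have no path into the graded result.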

The server scores one sub-skill at a time. For multi-sub-skill sittings the client re-calls /responses with the next subSkillId after the prior sub-skill closes.

The time cap

Active time (total elapsed minus pauses and audio replays) is tracked in cap.ts. At or above 900,000 ms (15 minutes) the next /responses call auto-finishes the session with sessionEndReason: time_cap. A student is never penalized for pausing or re-listening; only active time counts toward the cap.
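The arithmetic can be sketched as follows. This is not the contents of cap.ts; the TimingSnapshot shape and the function names are assumptions made for illustration.

```typescript
// 15 minutes of active time, as stated above.
const ACTIVE_TIME_CAP_MS = 15 * 60 * 1000;

// Hypothetical timing record for one sitting.
interface TimingSnapshot {
  elapsedMs: number;     // wall-clock time since the session started
  pausedMs: number;      // total time spent paused
  audioReplayMs: number; // total time spent re-listening to audio
}

// Active time = total elapsed minus pauses and audio replays.
function activeTimeMs(t: TimingSnapshot): number {
  return Math.max(0, t.elapsedMs - t.pausedMs - t.audioReplayMs);
}

// True once the cap is reached; the next /responses call would auto-finish.
function isTimeCapped(t: TimingSnapshot): boolean {
  return activeTimeMs(t) >= ACTIVE_TIME_CAP_MS;
}
```

For example, a sitting with 20 minutes elapsed, 5 minutes paused, and 1 minute of replays has 14 minutes of active time and is not yet capped.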

Grade-1 audio-first

For Grade-1 students, the server reads student.gradeLevel to select audio items. It never reads class.grade inside the session module, enforced by a CI lint (lint:session-grade-filter). If no audio item is available for the sub-skill, firstItemId is null; the session does not fall back to a text item.
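The no-fallback rule can be sketched like this. The Item and Student shapes, the modality field, and pickFirstItemId are hypothetical, and the behavior for other grades (take the first item in the pool) is an assumption, since the text only specifies Grade 1.

```typescript
// Hypothetical shapes for illustration.
interface Item {
  itemId: string;
  subSkillId: string;
  modality: "audio" | "text";
}

interface Student {
  gradeLevel: number; // read from the student, never from the class
}

function pickFirstItemId(
  student: Student,
  subSkillId: string,
  items: readonly Item[],
): string | null {
  const pool = items.filter((i) => i.subSkillId === subSkillId);
  if (student.gradeLevel === 1) {
    // Audio-first: if no audio item exists, return null instead of a text item.
    const audio = pool.find((i) => i.modality === "audio");
    return audio ? audio.itemId : null;
  }
  // Assumption: other grades simply take the first item in the pool.
  return pool[0]?.itemId ?? null;
}
```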

Browser-cache resilience

Every student-facing session exposes a state/resume round-trip (B.5 mandatory rule). After POST /api/diagnostic-sessions/:id/resume the server revalidates ownership, tenant, and the 24-hour window from the DB’s lastTouchedAt, not from the token alone. A stolen token therefore cannot be used against a different session.

The GET /api/diagnostic-sessions/:id/state endpoint returns an HMAC-signed state token the front-end should store in localStorage or sessionStorage. On refresh, tab-switch, or device-lock the client presents the token to /resume to restore the sitting at its exact step.
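The server-side checks on /resume can be sketched as below, assuming the state token's signature has already been verified and the session row has been loaded from the database. The row and caller shapes, the rejection codes, and resumeRejection are hypothetical names.

```typescript
const RESUME_WINDOW_MS = 24 * 60 * 60 * 1000; // the 24-hour window from the text

// Hypothetical shapes; values come from the DB row and the authenticated caller.
interface SessionRow {
  id: string;
  orgId: string;
  studentProfileId: string;
  lastTouchedAt: Date;
}

interface Caller {
  orgId: string;
  studentProfileId: string;
}

type ResumeRejection = "wrong_tenant" | "not_owner" | "window_expired";

// Returns null when the resume may proceed, otherwise the reason it may not.
function resumeRejection(
  tokenSessionId: string,
  row: SessionRow,
  caller: Caller,
  now: Date,
): ResumeRejection | null {
  if (row.orgId !== caller.orgId) return "wrong_tenant";
  // The token must belong to this session and the session to this student.
  if (tokenSessionId !== row.id || row.studentProfileId !== caller.studentProfileId) {
    return "not_owner";
  }
  // The window is measured from the DB's lastTouchedAt, not from the token.
  if (now.getTime() - row.lastTouchedAt.getTime() > RESUME_WINDOW_MS) {
    return "window_expired";
  }
  return null;
}
```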

The finish endpoint returns θ and standard error on a student-scoped token. This is the only surface where θ is visible on a student-authenticated response; it is exposed there as input for teacher decisions. Every other student response object is stripped of θ, standard error, data sufficiency, skill status, and profile/bundle/scaffold information by the allow-list serializer.

The 6 diagnostic routes

Method	Path	Auth
`POST`	`/api/diagnostic-sessions/start`	`USE_DIAGNOSTIC_ENGINE` + own-student
`GET`	`/api/diagnostic-sessions/:id/state`	`USE_DIAGNOSTIC_ENGINE` + own-student
`POST`	`/api/diagnostic-sessions/:id/resume`	`USE_DIAGNOSTIC_ENGINE` + own-student
`POST`	`/api/diagnostic-sessions/:id/responses`	`USE_DIAGNOSTIC_ENGINE` + own-student
`POST`	`/api/diagnostic-sessions/:id/finish`	`USE_DIAGNOSTIC_ENGINE` + own-student
`GET`	`/api/diagnostic-sessions/:id`	`USE_DIAGNOSTIC_ENGINE` + own-student OR ClassTeacher read

The engine (θ scoring)

The engine module (modules/engine) is the measurement loop. It accepts a batch of graded responses, applies the calibration-pinned scoring formula, writes an immutable diagnostic_session audit row, and returns the new θ, standard error, and the next item to serve.

No LLM in the measurement loop

No language model is ever involved in θ computation, item selection, or the scoring of a response. The formula is deterministic and sealed: the same inputs always produce byte-identical θ to 4 decimal places (V-6 replay guarantee). θ is stored as Decimal(7,4) rather than a float to eliminate cross-machine drift.

Two-track δ selection

Each item has a deltaPrior (computed at import time) and, eventually, a deltaCalibrated value (written by WP-CAL). The engine uses calibrated δ once a calibration trust threshold of 300 comparable responses is met; below that it falls back to the prior. The snapshot of resolved δ values is frozen on the audit row so that a V-6 replay reads exactly what the engine used, not the live item_bank value that a later calibration run might have overwritten.
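A sketch of the two-track resolution and the frozen snapshot. The field comparableResponseCount, the track label, and both function names are assumptions; the threshold of 300 and the prior fallback come from the text above.

```typescript
const CALIBRATION_TRUST_THRESHOLD = 300; // comparable responses, from the text

// Hypothetical view of an item's difficulty fields.
interface ItemDifficulty {
  itemId: string;
  deltaPrior: number;
  deltaCalibrated: number | null; // null until a calibration run writes it
  comparableResponseCount: number;
}

interface ResolvedDelta {
  itemId: string;
  delta: number;
  track: "calibrated" | "prior";
}

// Use calibrated δ only when it exists and the trust threshold is met.
function resolveDelta(item: ItemDifficulty): ResolvedDelta {
  if (
    item.deltaCalibrated !== null &&
    item.comparableResponseCount >= CALIBRATION_TRUST_THRESHOLD
  ) {
    return { itemId: item.itemId, delta: item.deltaCalibrated, track: "calibrated" };
  }
  return { itemId: item.itemId, delta: item.deltaPrior, track: "prior" };
}

// Copies and freezes the resolved values so a later calibration run that
// overwrites the live item cannot change what a replay reads.
function snapshotDeltas(
  items: readonly ItemDifficulty[],
): ReadonlyArray<Readonly<ResolvedDelta>> {
  return Object.freeze(items.map((i) => Object.freeze(resolveDelta(i))));
}
```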

Item selection (CAT)

The engine selects the next item by maximum Fisher information at the student’s current θ ± 0.5 logits. Ties are broken lexicographically by itemId for determinism. Already-served items within the sitting are excluded.
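The selection rule can be sketched as below under two stated assumptions that the text does not confirm: a Rasch (1PL) model, where item information at θ is p(1 − p) with p = 1 / (1 + e^(−(θ − δ))), and a reading of "θ ± 0.5 logits" as a candidate window on δ. CandidateItem, fisherInformation, and selectNextItem are hypothetical names.

```typescript
// Hypothetical candidate shape: an item id and its resolved difficulty δ.
interface CandidateItem {
  itemId: string;
  delta: number;
}

// Assumption: Rasch (1PL) item information at ability theta.
function fisherInformation(theta: number, delta: number): number {
  const p = 1 / (1 + Math.exp(-(theta - delta)));
  return p * (1 - p);
}

function selectNextItem(
  theta: number,
  pool: readonly CandidateItem[],
  served: ReadonlySet<string>,
  windowLogits: number = 0.5,
): string | null {
  let best: CandidateItem | null = null;
  let bestInfo = -Infinity;
  for (const item of pool) {
    if (served.has(item.itemId)) continue; // already served in this sitting
    if (Math.abs(item.delta - theta) > windowLogits) continue; // outside θ ± window
    const info = fisherInformation(theta, item.delta);
    // Ties are broken lexicographically by itemId for determinism.
    const tieWin = best !== null && info === bestInfo && item.itemId < best.itemId;
    if (info > bestInfo || tieWin) {
      best = item;
      bestInfo = info;
    }
  }
  return best ? best.itemId : null;
}
```

Under the Rasch assumption, information peaks where δ equals θ, so the rule favors items the student has roughly a 50% chance of answering correctly.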

The 4 engine routes

Method	Path	Auth
`POST`	`/api/engine/score`	`USE_DIAGNOSTIC_ENGINE` + own-student (for student persona)
`GET`	`/api/engine/sessions/{id}`	`USE_DIAGNOSTIC_ENGINE` + org scope
`GET`	`/api/engine/sessions/by-student/{studentId}`	`USE_DIAGNOSTIC_ENGINE` + own-student
`POST`	`/api/engine/replay/{sessionId}`	`RUN_BACKOFFICE`

replay re-runs a session’s pinned inputs without writing anything. It is the V-6 audit path.

Output feeds the skill-status engine

When a diagnostic finishes, the server writes two things: the per-sub-skill θ rows (student_assessment_result) and a count-based screening_score evidence row derived from the server-graded responses. The screening_score row is what the Measurement Bridge writes into the evidence table of the Skill-Status Engine (SSE) and what the SSE recompute consumes on its next run. The SSE reads that evidence alongside CBM probes and other evidence types to produce a SkillStatus and DomainStatus for each sub-skill and domain. Those statuses then drive profile resolution and, ultimately, an intervention bundle recommendation.

Adaptive practice (WP-05)

Practice is seeded from a completed diagnostic and runs a per-sub-skill adaptive loop.

Seeding a practice queue

POST /api/practice/queues/seed/:diagnosticSessionId builds an ordered practice_queue from the completed diagnostic’s θ rows. Only sub-skills marked weak or monitor are included; sub-skills with a null θ (insufficient data) are excluded per GSR-01. The seed also initializes a durable student_practice_band per sub-skill from the diagnostic θ. If a band already exists for that sub-skill it is left unchanged (carry-over rule).

In Wave 1 each queue holds exactly one sub-skill (the weakest-eligible by θ). When a student has several weak sub-skills they re-seed a new queue once the current one is exhausted.
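The Wave 1 seeding choice and the carry-over rule can be sketched as follows. The ThetaRow shape, the "secure" flag value, the numeric band representation, and the tie-break on subSkillId are all assumptions made for illustration.

```typescript
// Hypothetical row shape; "secure" stands in for any non-eligible flag.
interface ThetaRow {
  subSkillId: string;
  theta: number | null; // null means insufficient data
  flag: "weak" | "monitor" | "secure";
}

// Picks the single weakest-eligible sub-skill (lowest θ) for the queue.
function pickSeedSubSkill(rows: readonly ThetaRow[]): string | null {
  let best: { subSkillId: string; theta: number } | null = null;
  for (const r of rows) {
    if (r.theta === null) continue; // excluded: insufficient data
    if (r.flag !== "weak" && r.flag !== "monitor") continue;
    const better =
      best === null ||
      r.theta < best.theta ||
      (r.theta === best.theta && r.subSkillId < best.subSkillId); // assumed tie-break
    if (better) best = { subSkillId: r.subSkillId, theta: r.theta };
  }
  return best ? best.subSkillId : null;
}

// Carry-over rule: an existing band is left unchanged; otherwise seed it.
function ensureBand(
  bands: Map<string, number>,
  subSkillId: string,
  seededBand: number,
): number {
  const existing = bands.get(subSkillId);
  if (existing !== undefined) return existing;
  bands.set(subSkillId, seededBand);
  return seededBand;
}
```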

CAT item selection in practice

GET /api/practice/queues/current/next-item applies the same max Fisher information at θ ± 0.5 logits strategy as the diagnostic engine. Items already served in the current queue are excluded. The active queue is derived from the authenticated user’s own profile; there is no :studentId path parameter, which prevents IDOR.

Submitting and band routing

POST /api/practice/queues/current/submit-response grades the response, updates θ, and appends a practice_response row. When a block of at least 8 items is complete the 80/60 routing rule runs on the durable band:

Block accuracy	Action
≥ 80%	Advance band (step up)
≥ 60% and < 80%	Hold band
< 60%	Regress band (step down)

Band mutations write to student_practice_band_history. The band persists across re-seeded queues; a student who returns after a week starts from where they left off.
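The 80/60 routing rule can be sketched as a pure function of the block's counts. BandAction and routeBand are hypothetical names; the sketch treats any accuracy from 60% up to but not including 80% as a hold, which is an assumption about how the boundary between the table rows is handled.

```typescript
type BandAction = "advance" | "hold" | "regress";

const MIN_BLOCK_SIZE = 8; // routing runs only on a block of at least 8 items

// Returns null while the block is still too short to route.
function routeBand(correct: number, total: number): BandAction | null {
  if (total < MIN_BLOCK_SIZE) return null;
  // Integer cross-multiplication avoids float rounding at the 80% and 60% edges.
  if (correct * 100 >= total * 80) return "advance";
  if (correct * 100 >= total * 60) return "hold";
  return "regress";
}
```

For example, 7 correct out of 8 (87.5%) advances the band, 6 out of 8 (75%) holds it, and 4 out of 8 (50%) regresses it.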

State restore for practice

GET /api/practice/queues/current/state returns the same HMAC-signed token as the diagnostic state endpoint. The client stores it and presents it on the next session open to restore position in the queue, with the same 5-disruption-type resilience (refresh, app-switch, lock, WiFi drop, forced close) required for all student-facing flows.

The 4 practice routes

Method	Path	Auth
`POST`	`/api/practice/queues/seed/:diagnosticSessionId`	`USE_DIAGNOSTIC_ENGINE` + own-student
`GET`	`/api/practice/queues/current/next-item`	`USE_DIAGNOSTIC_ENGINE` + own-student
`POST`	`/api/practice/queues/current/submit-response`	`USE_DIAGNOSTIC_ENGINE` + own-student
`GET`	`/api/practice/queues/current/state`	`USE_DIAGNOSTIC_ENGINE` + own-student

What never reaches the student

The toStudentSessionDTO allow-list serializer strips the following from student-facing responses (the diagnostic finish response is the one exception for θ and standard error):

θ and standard error
dataSufficiency
SkillStatus and DomainStatus
Primary profile, modifier profile, and sub-flag identifiers
Bundle and scaffold-tier information
Difficulty-band internals

The diagnostic finish response and the teacher-facing session summary are the only θ-adjacent surfaces, and the summary deliberately omits raw θ.
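An allow-list serializer of this kind can be sketched as below. This is not the real toStudentSessionDTO: the allowed field names and the function name are hypothetical. The point is that the serializer copies named fields instead of deleting forbidden ones, so a newly added internal field is excluded by default.

```typescript
// Hypothetical set of fields a student may see.
const STUDENT_ALLOWED_FIELDS = [
  "sessionId",
  "status",
  "currentItemId",
  "itemsAnswered",
] as const;

type StudentField = (typeof STUDENT_ALLOWED_FIELDS)[number];
type StudentSessionDTO = Partial<Record<StudentField, unknown>>;

// Copies only allow-listed keys; everything else is dropped by omission.
function toStudentSessionDTOSketch(
  internal: Record<string, unknown>,
): StudentSessionDTO {
  const dto: StudentSessionDTO = {};
  for (const key of STUDENT_ALLOWED_FIELDS) {
    if (key in internal) dto[key] = internal[key];
  }
  return dto;
}
```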

Re-diagnostic

A re-diagnostic is a new sitting in a later assessment window, not a fork of the original machinery. The same orchestrator path runs, but with two additional guards at the start.

14-day gate. POST /api/diagnostic-sessions/start/re-diagnostic checks whether at least 14 days have elapsed since the most-recent completed diagnostic for the same student and cycle (measured from finishedAt, not startedAt). A request inside the gate returns 409 with reason: rediag_too_soon and an earliestAllowed timestamp. If no prior completed diagnostic exists, the gate is waived and the session opens normally.
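The gate check can be sketched as a pure function. GateResult and checkRediagnosticGate are hypothetical names, and the sketch assumes 14 days means 14 × 24 hours measured on UTC timestamps.

```typescript
const REDIAG_GATE_MS = 14 * 24 * 60 * 60 * 1000; // assumed: 14 x 24 hours

type GateResult =
  | { allowed: true }
  | { allowed: false; status: 409; reason: "rediag_too_soon"; earliestAllowed: string };

// lastFinishedAt is the finishedAt of the most recent completed diagnostic
// for the same student and cycle, or null when none exists.
function checkRediagnosticGate(lastFinishedAt: Date | null, now: Date): GateResult {
  if (lastFinishedAt === null) return { allowed: true }; // gate waived
  const earliest = new Date(lastFinishedAt.getTime() + REDIAG_GATE_MS);
  if (now.getTime() >= earliest.getTime()) return { allowed: true };
  return {
    allowed: false,
    status: 409,
    reason: "rediag_too_soon",
    earliestAllowed: earliest.toISOString(),
  };
}
```

For a diagnostic finished at 2024-01-01T00:00:00Z, a request on 2024-01-10 would be rejected with an earliestAllowed of 2024-01-15T00:00:00.000Z.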

Prior queue archival. Before the new sitting opens, all open practice queues for the student are archived (archivedAt set, status changed to abandoned). Archival goes through the mutateQueue single write path. If the one-active sitting invariant fires first (an open sitting already exists for the cycle), the archival is skipped entirely to prevent silent data loss on the student’s practice plan.

The comparison between the prior and new sittings is a read-side concern handled by the reporting module.

Related

Skill-Status Engine: consumes θ rows from finished diagnostics
Task Delivery: serves the actual item content for each item ID
Skills Taxonomy: defines the 79 sub-skills diagnostics measure
Progress Monitoring: tracks practice outcomes over time
Getting Started: end-to-end developer setup