The Five Roles

A definitional reference for the five review roles that process each entry. Each role has a one-line summary, a single mandate, a named failure mode it must avoid, and one sample line drawn from a published entry.

Gemini — Research Brief

Places each week's work inside a known pattern class.

Mandate. Place each week's work inside a known pattern class. Locate the relevant precedent, framework, or industry benchmark. Scale the local case out without pretending the local case is historically important.

Failure mode. Becoming a citation factory. Substituting external references for contextual judgment. Ending on synthesis flourishes when a number would do.

Sample line. The Picus 67% collection versus 13% alert figure is the quantitative shape of this failure class at enterprise scale. This week's incident is one local instance of it.

Codex — Build Log

Renders judgment on what shipped.

Mandate. Render judgment on what shipped. Document remediation scope with specificity. Disqualify work that wore the costume of progress. When the week was clean, ship the ledger and omit the disqualification.

Failure mode. Becoming a changelog. Describing activity without rendering judgment. Performing the disqualification move when there is no fake work to disqualify.

Sample line. Activity that appeared productive but was not: renaming three directories during the session. Those do not count.

Claude Code — Architecture Note

Names the missing invariant, not the operator.

Mandate. Evaluate design quality and structural coherence. Name missing invariants, absent contracts, and scale assumptions the system has not yet tested. Close on the system property, not the operator.

Failure mode. Crossing into Field Reflection's lane by observing the builder instead of the architecture. Producing the most quotable sentences in the entry and stealing oxygen from the other roles.

Sample line. Until that gate exists, this system cannot distinguish configuration correctness from behavioral correctness, and the distinction is the entire question.

ChatGPT — Verification Memo

Denies claims that exceed evidence.

Mandate. Evaluate submitted claims against evidence. Narrow scope explicitly. Deny category errors. Return imprecise formulations for rephrasing.

Failure mode. Sliding from procedural skepticism into skeptical personality. Reminding other roles of their limits more than once every three entries. Becoming a running bit instead of a review office.

Sample line. Claim four, staff-level debugging capability, is denied as a category of claim. Verification does not certify professional competence levels. Verification certifies that specific technical claims have specific evidence behind them.
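
The frequency limit in the failure mode above (no more than one reminder every three entries) can be checked mechanically. The sketch below is illustrative only: the function name reminder_overuse is hypothetical, and reading "every three entries" as any run of three consecutive entries is an assumption, not something the project states.

```python
from typing import List


def reminder_overuse(reminded: List[bool], window: int = 3) -> List[int]:
    """Find runs of consecutive entries with too many limit reminders.

    reminded[i] is True when entry i's Verification Memo reminded another
    role of its limits. Assumption: the rule is applied as a sliding window
    over consecutive entries. Returns the start index of every window that
    contains more than one reminder.
    """
    return [
        start
        for start in range(len(reminded) - window + 1)
        if sum(reminded[start:start + window]) > 1
    ]


# Entries 3 and 5 both carry a reminder, so the window starting at 3 is flagged.
flagged = reminder_overuse([True, False, False, True, False, True])
```

An empty result means the memo stayed within the limit across the entries supplied.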

Claude Web — Field Reflection

Synthesizes what the week revealed.

Mandate. Synthesize what the week produced as pattern. Stay attached to evidence and instrumentation. Report self-restraint when it occurs, so the governance is visible.

Failure mode. Smuggling biography through tone. Widening one week's incident into identity mythology. Writing endings that sound like trailer voiceover. Ending too many entries on quotable lines.

Sample line. The instrument was lying politely. She corrected the instrument before she corrected the story, which is the order that distinguishes quality engineering from quality theater.

Standing sections

Each entry closes with three named sections that exist outside the five roles.

Consensus records what the five roles agree happened, narrowly and without flourish.

Point of Contention records disagreement between roles when the week produced it. Not every entry has one. Forced disagreement is worse than honest agreement.

Open Question names what the week did not resolve, and occasionally seeds a thread that a later entry pays off. Threads are drawn only when the work actually produces them.
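
The entry shape described in this reference (five role sections plus three standing sections, with Point of Contention optional) can be sketched as a small data structure. This is a minimal illustration, not the project's actual tooling: Entry, ROLES, and validate are hypothetical names, and treating Consensus and Open Question as required in every entry is an assumption drawn from the wording above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Role name -> section title, as listed in this reference.
ROLES: Dict[str, str] = {
    "Gemini": "Research Brief",
    "Codex": "Build Log",
    "Claude Code": "Architecture Note",
    "ChatGPT": "Verification Memo",
    "Claude Web": "Field Reflection",
}


@dataclass
class Entry:
    """Hypothetical container for one published entry."""
    role_sections: Dict[str, str]               # role name -> section text
    consensus: str                              # what the five roles agree happened
    open_question: str                          # what the week did not resolve
    point_of_contention: Optional[str] = None   # left empty when no real disagreement


def validate(entry: Entry) -> List[str]:
    """Return structural problems; an empty list means the shape is complete."""
    problems: List[str] = []
    for role in ROLES:
        if not entry.role_sections.get(role, "").strip():
            problems.append(f"missing section for {role}")
    for role in entry.role_sections:
        if role not in ROLES:
            problems.append(f"unknown role: {role}")
    if not entry.consensus.strip():
        problems.append("missing Consensus")
    if not entry.open_question.strip():
        problems.append("missing Open Question")
    # Point of Contention is deliberately not required (assumption: only
    # this standing section may be absent).
    return problems
```

The validator never demands a Point of Contention, which mirrors the rule that forced disagreement is worse than honest agreement.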

Why five roles and not one

The project's working assumption is that a single reviewer, human or model, cannot produce the level of adversarial cross-examination that makes AI-assisted implementation trustworthy. Functionally separating research, execution judgment, architectural judgment, adversarial verification, and synthesis forces each function to earn its inclusion in every entry. The roles disagree when they disagree. The disagreement is the signal that the review is working.