Introducing IRG: A Protocol for Persistent, Structured AI Reasoning
We’re releasing the Iterative Reasoning Graph specification — an open protocol for AI systems that reason in explicit, persistent, revisable structures rather than ephemeral token streams. The reasoning doesn’t disappear after the response. It lives in a graph you can inspect, replay, and audit. GitHub: arcus-labs/IRG-spec
The Problem
Modern AI systems reason in isolated episodes.
A model generates a response. The reasoning evaporates. You see the output but not the process. If something went wrong, you can’t point to where. If you want to improve it, you retrain or re-prompt and hope.
The techniques we’ve developed to address this — chain-of-thought, self-critique, tool orchestration — are workarounds, not architecture. They happen inside the prompt or get logged after the fact. They don’t persist. They don’t revise. They don’t compound.
What’s missing is a first-class representation of reasoning itself.
What IRG Is
An Iterative Reasoning Graph is an explicit, evolving graph of executable reasoning nodes.
Each node represents a reasoning operation: a draft, a critique, a fact-check, a revision, a risk assessment, a decision to abstain. Edges encode why nodes exist — dependency, refinement, invalidation, supersession.
This isn’t a log of what happened. It’s a structure that governs what happens next.
When an IRG node executes, it can:
- Invoke a generative model
- Query a memory system
- Call external tools
- Create, revise, or invalidate other nodes
The graph persists across time. Reasoning survives beyond a single response. Errors can be traced to specific nodes. Corrections are local — you fix the subgraph, not the whole system.
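To make this concrete, here is a minimal sketch of what a node, a typed edge, and an executing node might look like. The spec does not prescribe a representation; the class names, edge vocabulary, and `execute_critique` helper below are illustrative assumptions, showing only the key property that execution edits the graph rather than appending to a log.

```python
from dataclasses import dataclass, field
from enum import Enum

class EdgeKind(Enum):
    DEPENDENCY = "dependency"      # this node relies on another
    REFINEMENT = "refinement"      # this node improves another
    INVALIDATION = "invalidation"  # this node marks another as wrong
    SUPERSESSION = "supersession"  # this node replaces another

@dataclass
class Node:
    id: str
    kind: str          # e.g. "draft", "critique", "fact_check", "revision"
    content: str
    valid: bool = True

@dataclass
class Graph:
    nodes: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (src, dst, EdgeKind) triples

    def add(self, node: Node) -> Node:
        self.nodes[node.id] = node
        return node

    def link(self, src: str, dst: str, kind: EdgeKind) -> None:
        self.edges.append((src, dst, kind))

# Executing a critique node can create a revision node and invalidate
# the draft it targets: structural edits, not after-the-fact logging.
def execute_critique(graph: Graph, critique_id: str, draft_id: str) -> Node:
    draft = graph.nodes[draft_id]
    revision = graph.add(Node(id=f"{draft_id}.rev1", kind="revision",
                              content=f"revised({draft.content})"))
    graph.link(revision.id, draft_id, EdgeKind.SUPERSESSION)
    graph.link(critique_id, draft_id, EdgeKind.INVALIDATION)
    draft.valid = False
    return revision
```

Note the locality: the fix touches one subgraph (the draft, its critique, and the new revision) while the rest of the graph is untouched.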
Why Graphs, Why Iteration
Graphs because reasoning isn’t linear. Real thinking branches, backtracks, and revises. A chain-of-thought is a transcript. A graph is a map.
Iteration because getting it right the first time is the exception. Good reasoning involves drafting, checking, critiquing, and refining. IRG makes iteration a first-class primitive, not something you simulate by prompting the model again.
A canonical IRG cycle:
- Clarify — surface missing assumptions
- Draft — generate an initial response
- Evaluate — fact-check, critique, assess coherence
- Predict — estimate downstream risks and failure modes
- Revise — apply targeted fixes
- Converge or iterate — stop when stable, abstain when stuck
Each step is a node. The whole process is inspectable.
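The cycle above can be sketched as a driver loop. The step functions here are placeholders for model and tool calls, and the stopping rule is an invented illustration, not part of the spec; what matters is that every step appends a node to an inspectable trace and that convergence and abstention are explicit outcomes.

```python
def run_cycle(task: str, max_iters: int = 4) -> dict:
    trace = []                                    # one entry per reasoning node
    assumptions = clarify(task, trace)            # Clarify
    draft = make_draft(task, assumptions, trace)  # Draft
    for _ in range(max_iters):
        issues = evaluate(draft, trace)           # Evaluate
        risks = predict(draft, trace)             # Predict
        if not issues and not risks:
            trace.append({"kind": "converge", "content": draft})
            return {"state": "converged", "answer": draft, "trace": trace}
        draft = revise(draft, issues + risks, trace)  # Revise
    trace.append({"kind": "abstain", "content": None})
    return {"state": "abstained", "answer": None, "trace": trace}

# Toy placeholder steps: a real system would invoke models and tools here.
def clarify(task, trace):
    trace.append({"kind": "clarify", "content": f"assumptions for {task!r}"})
    return []

def make_draft(task, assumptions, trace):
    trace.append({"kind": "draft", "content": task})
    return task

def evaluate(draft, trace):
    issues = ["unresolved question"] if "?" in draft else []
    trace.append({"kind": "evaluate", "content": issues})
    return issues

def predict(draft, trace):
    trace.append({"kind": "predict", "content": []})
    return []

def revise(draft, problems, trace):
    fixed = draft.replace("?", "")
    trace.append({"kind": "revise", "content": fixed})
    return fixed
```

Iterating, checking, and stopping live in the structure itself, not in a prompt asking the model to "think step by step."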
What This Changes
Debugging
You can see exactly where reasoning failed and why.
Correction
Fixes are local. You revise or invalidate the faulty subgraph instead of retraining the model or rewriting the prompt.
Safety
Verification, risk assessment, and abstention are explicit nodes you can audit, not behaviors you hope the model exhibits.
Long-horizon reasoning
Because the graph persists, later reasoning builds on earlier structure instead of starting from zero with every response.
Minimal Compliance
The spec defines six requirements for IRG compliance:
- Persistent reasoning structure — reasoning survives beyond a single inference
- Executable nodes — reasoning steps are units that can run, revise, or invalidate
- Explicit relations — edges encode dependency, refinement, invalidation
- Iterative revision — critique and revision are first-class, not simulated
- Inspectability — the graph is exportable and auditable
- Termination semantics — convergence, abstention, and failure are explicit states
Systems that rely solely on linear chain-of-thought, stateless self-critique, or prompt-only orchestration don’t qualify — even if they exhibit iterative behavior. The structure has to be real.
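As a rough illustration of what "the structure has to be real" means in practice, here is a toy audit over a hypothetical JSON export of a graph. The export schema is invented for this example; the spec requires only that the graph be exportable and auditable, with typed relations and explicit termination states.

```python
# Relation types and terminal states assumed for this sketch.
ALLOWED_EDGES = {"dependency", "refinement", "invalidation", "supersession"}
TERMINAL_STATES = {"converged", "abstained", "failed"}

def audit(export: dict) -> list:
    """Return a list of compliance problems; empty means the export passes."""
    problems = []
    node_ids = {n["id"] for n in export.get("nodes", [])}
    for e in export.get("edges", []):
        if e["kind"] not in ALLOWED_EDGES:
            problems.append(
                f"edge {e['src']}->{e['dst']}: untyped relation {e['kind']!r}")
        if e["src"] not in node_ids or e["dst"] not in node_ids:
            problems.append(f"edge {e['src']}->{e['dst']}: dangling endpoint")
    if export.get("state") not in TERMINAL_STATES:
        problems.append("no explicit termination state")
    return problems
```

A linear chain-of-thought transcript fails this kind of check immediately: it has no typed edges to validate and no declared terminal state.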
What IRG Is Not
Not chain-of-thought. CoT is linear, ephemeral, and generated incidentally during inference. IRG is graph-structured, persistent, and explicitly authored.
Not a prompting technique. The IRG exists outside the prompt boundary. It governs model invocation, not the other way around.
Not a model architecture. IRG is model-agnostic. It can orchestrate transformers, diffusion models, symbolic systems, or external tools.
Not fine-tuning. IRG doesn’t modify weights. It encodes corrections into structure, not parameters.
Not just a log. Logs record what happened. IRG determines what happens next.
Relationship to EIE
Last week we released EIE — a protocol for measuring epistemic integrity in AI systems.
The relationship is simple:
- IRG structures process
- EIE evaluates outcomes
IRG provides architectural affordances that may improve epistemic behavior — explicit verification nodes, revision under critique, abstention as a first-class outcome. But IRG doesn’t guarantee good epistemic behavior. Poorly designed nodes or biased evaluators can still produce bad outcomes.
EIE measures whether the system actually behaved well, regardless of architecture. You can run EIE on IRG systems and non-IRG systems alike. If IRG is doing its job, EIE scores should reflect it.
What’s Coming
The spec released today defines the protocol. Next comes the implementation layer.
Reason is a cognitive engineering language for authoring IRGs. Instead of hand-wiring graphs, you write structured reasoning strategies — “thoughts” — that declare what operations to perform, what to verify, when to revise, and when to stop.
Reason compiles to IRG. IRG executes against models and tools. The trace is fully auditable.
We’ll have more to share soon. If you want early access, reach out.
Why Open
We’re releasing IRG under CC-BY-4.0 because reasoning infrastructure shouldn’t be proprietary.
If AI systems are going to be trusted in high-stakes contexts — medicine, law, finance, security — the way they reason needs to be inspectable, comparable, and governable. That requires shared protocols, not walled gardens.
We’ll compete on implementation. The spec is open.
Get Involved
The spec is v0.3. We’re looking for:
- Implementers — build IRG-compliant systems, stress-test the spec
- Critics — find the gaps, edge cases, and failure modes
- Design partners — if you’re deploying AI in regulated or high-stakes domains and care about auditability, we want to talk
IRG treats reasoning as what it is: a structure that persists, evolves, and governs. Not a side effect of token generation. Not a log you review after the fact. A first-class artifact you can inspect, debug, and improve.
That’s the foundation. Now we build on it.