Releasing irg-reference: An Open Implementation of the IRG Protocol
Today we’re releasing irg-reference, an open implementation of the Iterative Reasoning Graph protocol. It is a JavaScript runtime that executes IRG graphs against seven LLM providers, plus a Next.js trace navigator that turns every reasoning run into an inspectable artifact. The license is CC BY-SA 4.0, including for commercial use. If you have argued for AI reasoning that is structured, externalized, and auditable, this is the artifact that lets you stop arguing and start running.
Why a Reference Implementation Now
For most of the last year we have been writing about a thesis: that the reasoning a language model performs should not evaporate at the moment of token generation. It should be a first-class artifact — structured, executable, persistent, and inspectable. We have called the protocol that makes this possible the Iterative Reasoning Graph. We have made the case for why logging is not governance, why model validation breaks down without reasoning traces, and why regulatory frameworks from SR 11-7 to the Colorado AI Act increasingly require documentation of the decision process, not just the decision.
Arguments are useful. Running code is more useful. A protocol that exists only in writing can be debated; a protocol with a working reference implementation can be tried. We wanted anyone deploying AI in a consequential decision context to be able to see, run, modify, and verify the reasoning architecture for themselves, and that required the implementation to exist in the open.
That is what we are publishing today.
What “Reasoning as an Artifact” Actually Looks Like
The hardest thing to communicate about IRG, in writing, is the difference between describing a reasoning trace and seeing one. A trace is a graph of nodes — clarify, strategy, adversary, arbiter, fact-check, impact, draft, meta-evaluation, assessor, convergence — each with its inputs, its outputs, the prompts that shaped it, the model that ran it, the iteration it belonged to, and the revisions it triggered downstream. That structure is not metadata about the reasoning. It is the reasoning, made externally observable.
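To make that concrete, here is a sketch of what one node in a trace might look like as plain data. The field names here are assumptions chosen for illustration, not the runtime's actual schema; the point is that once the reasoning is data, inspecting it is ordinary code.

```javascript
// Illustrative shape of a single trace node. Field names are
// assumptions for explanation, not the runtime's actual schema.
const adversaryNode = {
  id: "adversary-1",
  type: "adversary",      // one of the ten node types
  iteration: 1,           // which reasoning cycle produced it
  model: "gpt-4o",        // hypothetical model identifier
  prompt: "Challenge the strategy's weakest assumption.",
  inputs: ["strategy-1"], // upstream node ids
  output: "The plan assumes the cited figure is current; verify it.",
  revises: ["draft-1"],   // downstream nodes this challenge triggered
};

// Because the trace is plain data, auditing it is ordinary code:
function nodesRevisedBy(trace, nodeId) {
  const node = trace.find((n) => n.id === nodeId);
  return node ? node.revises : [];
}

const trace = [adversaryNode];
console.log(nodesRevisedBy(trace, "adversary-1")); // the revision links
```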
The trace navigator that ships with this release exists mostly to make that real. Run a query against the IRG runtime; the navigator shows you, step by step, what the system planned, what the adversary node challenged, where the fact-check pipeline disagreed, when confidence was raised or revised, and how the convergence node decided the run was complete. The output is not a chat transcript. It is a record of how the system thought.
This is the form that auditability has to take if it is going to be more than logging. A regulator, a model risk team, a clinical safety reviewer, or an engineer trying to debug a bad answer all need the same thing: not the final output, but the path the system took to get there. The trace is that path.
What Ships in the Repo
The release has three pieces.
The first is api-impl-js/, a JavaScript runtime built around an Express server. It exposes two endpoints — POST /webhook/irg-process for full IRG runs and POST /webhook/fact-check-process for the fact-check pipeline used standalone — and ships with two graph variants out of the box. irg-simple is the ten-node default flow that runs the full reasoning cycle without external retrieval. irg-full adds an external fact-check pipeline that performs source generation and citation writing, with provisional artifacts persisted under _fact-store/ for review and reuse.
The runtime is multi-provider by design. Seven LLM providers are wired in: Groq, OpenAI, Anthropic, Mistral, Google, Together, and Ollama for local inference. Provider selection happens per request, either explicitly or inferred from the model name’s prefix. This is deliberate. IRG is a protocol, not a model wrapper, and the implementation should make that obvious by treating the underlying model as a configurable concern rather than a fixed assumption.
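Prefix-based routing is simple enough to sketch. The prefix table below is an assumption for illustration only; the repo's README documents the actual routing rules and the explicit-selection override.

```javascript
// A minimal sketch of prefix-based provider routing. The prefix
// table is hypothetical; the README documents the real rules.
const PREFIX_TO_PROVIDER = {
  "gpt-": "openai",
  "claude-": "anthropic",
  "mistral-": "mistral",
  "gemini-": "google",
  "llama-": "groq", // hypothetical mapping for illustration
};

function resolveProvider(model, explicitProvider) {
  if (explicitProvider) return explicitProvider; // explicit choice wins
  const hit = Object.entries(PREFIX_TO_PROVIDER).find(([prefix]) =>
    model.startsWith(prefix)
  );
  return hit ? hit[1] : "ollama"; // fall back to local inference
}

console.log(resolveProvider("claude-sonnet-4"));    // "anthropic"
console.log(resolveProvider("gpt-4o", "together")); // "together"
```

The design point is the one the paragraph above makes: the model is a per-request parameter, so the protocol never hard-codes a vendor.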
The second piece is trace-navigator/, a Next.js application for inspecting reasoning runs. It is the part that turns a JSON trace into something a human can actually read. You point it at a run, walk through nodes in order, and see the prompts, completions, fact-check results, confidence assessments, and revision links between steps. If you have ever wanted to know what an AI system was actually doing inside a reasoning loop, this is the view.
The third piece is _docs/, the design notes that motivate the protocol. The IRG and EIE writeups are included, along with the Epistemic Integrity Evaluation question set and a flow diagram for the graph itself. These are the source documents the implementation was built from. Reading them is not a prerequisite for running the code, but the architectural choices in the runtime are easier to understand once the protocol document has been seen.
Trying It
If you want to run the full stack locally, the path is short. Clone the repo, copy .env.example to .env, fill in at least one provider API key, then run npm install && npm run api in api-impl-js/ and npm install && npm run dev in trace-navigator/. The API listens on port 2100; the navigator on port 2000. From there, a single curl call against the IRG endpoint will produce a full trace, and the navigator will render it.
The README documents the request and response shapes, the provider routing rules, the per-request configuration knobs (max iterations, confidence threshold, fact-check on/off, impact prediction on/off, assessor on/off), and the structure of the trace data the runtime emits. None of that is hidden behind a CLA or an enterprise tier. It is in the repo.
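Putting the quickstart and the configuration knobs together, a call against a local runtime might look like the sketch below. The field names in the request body are assumptions chosen to mirror the knobs listed above; the README is authoritative for the actual request shape.

```javascript
// Sketch of a request to the IRG endpoint. Body field names are
// assumptions for illustration; see the README for the real shape.
function buildIrgRequest(query, overrides = {}) {
  return {
    query,
    graph: "irg-simple",      // or "irg-full" for external fact-check
    maxIterations: 3,         // hypothetical knob names below
    confidenceThreshold: 0.8,
    factCheck: false,
    impactPrediction: true,
    assessor: true,
    ...overrides,
  };
}

// Posting against the local runtime (port 2100, per the quickstart):
async function runIrg(query) {
  const res = await fetch("http://localhost:2100/webhook/irg-process", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildIrgRequest(query)),
  });
  return res.json(); // the full trace, ready for the navigator
}
```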
One important caveat. This is a reference implementation. There is no authentication on the API server and no rate limiting. If you expose it to the public internet without a proxy in front of it, a bad actor can drain your provider quotas. The SECURITY.md file in the repo is short and we recommend reading it before deploying anywhere reachable.
What This Is and Is Not
This release is the protocol made concrete. It is the architecture that has been described in the IRG and EIE writeups, written down as code, with enough surface area to be useful for experimentation, for academic work, for teams trying to understand what externalized reasoning actually buys them in practice, and for engineers building systems that need to be auditable by regulators or by their own risk functions.
It is not the commercial product. Arcus Labs publishes a separate product line built on the same ideas — the Reason cognitive engineering language and its compiler, the production EIE monitoring pipeline, the institutional governance tooling for regulated deployments. That work is licensed under different terms and lives in different repositories. The reference implementation exists because we believe the protocol layer should be open, that organizations should be able to verify the reasoning architecture of any system they deploy, and that the conversation about AI accountability is better when there is shared running code to point at.
The License
The repository is licensed under Creative Commons Attribution-ShareAlike 4.0 International. You can use the code commercially. You can modify it. You can fork it. You can build products on it. The two requirements are attribution and that derivative works carry the same license — the standard ShareAlike provision. We chose CC BY-SA because the protocol benefits from being shared, and because we want forks and improvements to flow back into the commons rather than disappearing into closed derivatives.
What We Want From the Community
Three things. First, run it. The fastest way for the protocol to improve is for it to encounter problems we have not thought of, and that requires real workloads beyond our own. Second, file issues and pull requests. The graph definitions in graphs/, the node primitives in core/, and the trace navigator are all places where small improvements compound quickly. Third, write about what you find. The argument that AI reasoning should be structured and inspectable is stronger when more people have hands-on experience with what that actually looks like.
The protocol layer of AI reasoning should be open, inspectable, and shared. irg-reference is our contribution to making it so.