Reading the Zenodo preprint: a guided tour.

The Namjoshi 2026 preprint (DOI 10.5281/zenodo.19785799) is an unrefereed review of active inference and free energy minimization. This is a reader's tour: what is well grounded, what is preliminary, and where you should push back before you cite anything.

UNI is a working hypothesis on an attainable path toward General Natural Intelligence: a natural, active-inference approach whose evidence is growing, evidence-classed, and tested in the open. Do not take the claim on faith. Test the build, inspect the gates, and help us find where it fails.

Front matter and abstract

The abstract frames the paper as a collaborative review, not a new result. It states plainly that the manuscript is a preprint and has not been peer reviewed (Class C). Read this first, then set expectations: what follows is a synthesis of prior work with a small set of testable claims layered on top, not a claim of discovery.

Sections 1 and 2, the free energy setup

These sections rebuild the POMDP formulation of active inference after Parr, Pezzulo and Friston (2022), Active Inference: The Free Energy Principle in Mind, Brain, and Behavior, MIT Press (Class E). Variational free energy is presented as an upper bound on surprise, minimized through both perception (updating beliefs about hidden states) and action (changing what is observed).
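The bound is easy to check by hand on a toy model. The sketch below is not taken from the preprint: the two-state, two-observation model and its numbers are made up for illustration, and the function names are hypothetical. It computes surprise, -log p(o), and the variational free energy F for a belief q(s), and shows that F sits above surprise for an arbitrary belief and meets it when q is the exact posterior.

```python
import math

# Hypothetical generative model, illustrative numbers only (not from the preprint).
prior = [0.5, 0.5]            # p(s)
likelihood = [[0.9, 0.1],     # p(o | s=0)
              [0.2, 0.8]]     # p(o | s=1)


def surprise(o):
    """Return -log p(o), where p(o) = sum_s p(o | s) p(s)."""
    return -math.log(sum(likelihood[s][o] * prior[s] for s in range(2)))


def free_energy(q, o):
    """Return F = sum_s q(s) * (log q(s) - log p(o, s))."""
    return sum(
        q[s] * (math.log(q[s]) - math.log(likelihood[s][o] * prior[s]))
        for s in range(2)
        if q[s] > 0
    )


def exact_posterior(o):
    """Return p(s | o) by normalizing the joint p(o, s)."""
    joint = [likelihood[s][o] * prior[s] for s in range(2)]
    z = sum(joint)
    return [j / z for j in joint]


o = 0
print("surprise:            ", surprise(o))
print("F at a flat belief:  ", free_energy([0.5, 0.5], o))
print("F at exact posterior:", free_energy(exact_posterior(o), o))
```

The gap between F and surprise is the KL divergence from q to the exact posterior, which is why updating beliefs (perception) can only tighten the bound, while changing the observation (action) changes the surprise itself.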

Push back here: the exposition is careful, but the paper leans on the reader's prior familiarity with KL divergence and Bayesian inference under a generative model. If those are unfamiliar, run the Precision Lab in parallel. The math becomes concrete when you can move a sensory-precision dial and watch the posterior collapse.
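If the lab is not at hand, the same effect can be sketched in a few lines. This is an assumption-laden toy, not the Precision Lab's code: it models sensory precision as an exponent applied to each row of the likelihood before renormalizing, which is one common parameterization, and the model numbers and function names are hypothetical. As the precision rises, the same observation pulls the posterior toward a single state and its entropy falls.

```python
import math


def sharpen(row, precision):
    """Raise a likelihood row to a power and renormalize (assumed parameterization)."""
    powered = [p ** precision for p in row]
    z = sum(powered)
    return [p / z for p in powered]


def posterior(prior, likelihood, o, precision):
    """Bayes update over hidden states using the precision-weighted likelihood."""
    joint = [
        prior[s] * sharpen(likelihood[s], precision)[o]
        for s in range(len(prior))
    ]
    z = sum(joint)
    return [j / z for j in joint]


def entropy(dist):
    """Shannon entropy in nats."""
    return -sum(p * math.log(p) for p in dist if p > 0)


# Hypothetical model: rows are p(o | s); numbers are illustrative only.
prior = [0.5, 0.5]
likelihood = [[0.7, 0.3], [0.4, 0.6]]

for precision in (0.0, 1.0, 4.0, 16.0):
    q = posterior(prior, likelihood, o=0, precision=precision)
    print(precision, [round(x, 3) for x in q], round(entropy(q), 3))
```

At precision 0 the observation carries no information and the posterior equals the prior; at high precision nearly all the mass lands on one state.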

Sections 3 to 5, precision, policies, and Markov blankets

The middle of the paper is where the labs live. Sensory precision, transition precision, and policy temperature are introduced as three control surfaces that produce distinct behavioral regimes. The Markov blanket is used descriptively, as a boundary condition for a self-organizing system that maintains a viable set, not as a claim about consciousness or life (Class E).

This is the strongest part of the review. The mapping from equations to steerable dials is what the five in-browser labs make good on. If you want to check the claim, open a lab, move a dial, and watch the free energy trace. That is the test.
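To make one of those dials concrete, here is a minimal sketch of the policy-temperature surface, assuming the usual softmax over negative expected free energy. The expected free energy values and the function name are hypothetical, and the labs may parameterize the dial differently (for example as an inverse temperature).

```python
import math


def policy_distribution(expected_free_energy, temperature):
    """Softmax over negative expected free energy, scaled by a temperature."""
    logits = [-g / temperature for g in expected_free_energy]
    m = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(x - m) for x in logits]
    z = sum(weights)
    return [w / z for w in weights]


# Hypothetical expected free energy for three candidate policies (lower is better).
G = [1.0, 1.5, 3.0]

for temperature in (0.1, 1.0, 10.0):
    probs = policy_distribution(G, temperature)
    print(temperature, [round(p, 3) for p in probs])
```

A low temperature makes the agent commit almost entirely to the policy with the lowest expected free energy; a high temperature flattens the distribution toward random choice. Those are two of the distinct regimes the section describes.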

Section 6, the Cell Lab framing

The paper introduces a service-cell framing based on Mikkilineni (2022), DOI 10.3390/info13010024 (Class E). This section is where the review commits to something falsifiable: a hidden 216-state cell, an active-inference controller, and three baselines (random, rule-based, neural) under seven disturbance families. The pre-registered scoring rule is RecoveryScore, the fraction of ticks inside the viable set weighted by excursion depth (Class C).
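The one-line description of RecoveryScore leaves the exact weighting open, so check the preprint before relying on any formula. The sketch below is one possible reading and an assumption throughout: each tick contributes 1 when the cell is inside the viable set and loses credit in proportion to how far outside it is, capped at a hypothetical `depth_cap`. The function name, the cap, and the linear penalty are illustrative, not the pre-registered rule.

```python
def recovery_score(excursion_depths, depth_cap):
    """One possible reading of RecoveryScore (an assumption, not the paper's rule).

    excursion_depths[t] is 0 when tick t is inside the viable set and a
    positive distance otherwise. Each tick scores 1 minus its depth as a
    fraction of depth_cap; the result is the mean over ticks, in [0, 1].
    """
    if not excursion_depths:
        raise ValueError("need at least one tick")
    if depth_cap <= 0:
        raise ValueError("depth_cap must be positive")
    penalty = sum(min(d, depth_cap) / depth_cap for d in excursion_depths)
    return 1.0 - penalty / len(excursion_depths)


# Hypothetical 4-tick run: inside, shallow excursion, deep excursion, recovered.
print(recovery_score([0, 1, 2, 0], depth_cap=2))
```

Under this reading a run that never leaves the viable set scores 1, a run pinned at the cap scores 0, and a shallow excursion costs less than a deep one of the same duration.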

Section 7, results and their limits

The results table shows the controller winning some families and losing others. This is a feature, not a bug. A single active-inference controller that beat every baseline on every family would be a red flag. The paper reports the three losses in the same font as the wins.

Where to push back: the committed cache is depth 2, 6 seeds, 80 ticks. That is a small footprint. Wider sweeps are planned. Treat the numbers as a directional signal, not a settled answer.

What the preprint is not

Not a clinical instrument. Not a diagnostic tool. Not a claim that active inference is the correct theory of mental health. Behavioral labels are candidate computational phenotypes: hypotheses, not diagnoses. The paper is a preprint. Expert review is pending.

Where to go next

Read the preprint on Zenodo (DOI 10.5281/zenodo.19785799). Then open the labs and try to break the claims. If you find a case the controller should handle and does not, that is the kind of feedback the benchmark exists to receive. See /transparency for how we record what we know, what we do not, and how we invite disconfirmation.

The benchmark and the paper: the Stratified Palimpsest
How the Cell Lab benchmark and the preprint sit next to each other, and what each is allowed to claim.
Gates and falsifiers: how we know when we are wrong
The pre-registered falsification criteria, in plain language, with the losses shown.
Transparency
Evidence-class tags, honesty fences, and the record of what has been checked and what has not.
The workshop
If you want to work through the preprint with us, this is where the guided reading lives.