Generative Models: The Organism's Model of Its World, Universal Natural Intelligence

An organism does not receive the world. It guesses at the world, and then it checks. The math for that guess is called a generative model, and it is the object every active-inference agent carries inside itself. UNI is a working hypothesis on an attainable path toward General Natural Intelligence: a natural, active-inference approach whose evidence is growing, graded by evidence class, and tested in the open. This post walks through the model our labs actually run, and names the falsifier that would force us to revise it.

What a generative model actually is

A generative model is a joint probability over two things: the hidden states of the world (call them s) and the observations the organism can sense (call them o). Written together, that joint is P(o, s). The word "generative" is doing real work here. The model does not just classify observations after the fact. It says how observations are generated from causes, and it lets the organism run that machinery in reverse to infer the causes from the observations. Class E (Parr, Pezzulo, Friston, 2022, chapters 2 and 4).

Two pieces make up that joint, and both matter:

The prior P(s), the organism's beliefs about the world before it looks. Priors are not decoration. They carry evolutionary, developmental, and just-a-moment-ago context.
The likelihood P(o | s), the mapping from hidden causes to sensed effects. In a maze it says: given that I am in this cell, what walls should I feel. In a service cell it says: given that memory pressure is rising, what should my telemetry look like.

Perception, in this frame, is Bayesian inversion. The organism has P(s) and P(o | s). It observes o. It wants the posterior P(s | o). Bayes gives the exact answer in principle; active inference gives the tractable answer in practice, by approximating that posterior with a variational distribution and minimizing a bound called variational free energy.
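The relationship between the exact posterior and the free-energy bound can be checked numerically on a toy model. The sketch below is illustrative Python, not the workbench code; the three-state model and all of its numbers are assumptions chosen for the example. It computes the exact posterior by Bayes' rule, then evaluates variational free energy F(q) = sum_s q(s) [log q(s) - log P(o, s)] for two candidate beliefs.

```python
import math

# Hypothetical toy model: three hidden states, two observations.
PRIOR = [0.5, 0.3, 0.2]                 # P(s)
LIKELIHOOD = [                          # LIKELIHOOD[o][s] = P(o | s)
    [0.9, 0.5, 0.1],
    [0.1, 0.5, 0.9],
]

def exact_posterior(o):
    """Bayes' rule: P(s | o) = P(o | s) P(s) / P(o). Returns (posterior, P(o))."""
    joint = [LIKELIHOOD[o][s] * PRIOR[s] for s in range(len(PRIOR))]
    evidence = sum(joint)
    return [j / evidence for j in joint], evidence

def free_energy(q, o):
    """Variational free energy of belief q given observation o.

    F(q) = sum_s q(s) * (log q(s) - log P(o, s)), an upper bound on -log P(o).
    """
    total = 0.0
    for s, qs in enumerate(q):
        if qs > 0.0:
            total += qs * (math.log(qs) - math.log(LIKELIHOOD[o][s] * PRIOR[s]))
    return total

posterior, evidence = exact_posterior(1)
surprise = -math.log(evidence)              # -log P(o)
f_at_posterior = free_energy(posterior, 1)  # bound is tight at the exact posterior
f_at_prior = free_energy(PRIOR, 1)          # any other belief sits above it
```

In this sketch the bound is tight when q equals the exact posterior and strictly looser for any other belief, which is why minimizing F over q is a workable stand-in for Bayesian inversion when the exact posterior is too expensive to compute.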

Hidden states versus observations, why the split matters

The word "hidden" is precise. The agent never sees s, only o. That gap is the reason inference exists. The interesting cases live in the ambiguity: two states can produce similar observations (aliasing), and one state can produce very different observations under different noise regimes. Class E (Parr et al., 2022, section 2.8). In the Precision Lab this is visible on the dial: turn sensory precision down and the same wall pattern becomes consistent with more candidate positions; turn it up and belief sharpens onto one cell. Nothing about the world changed. What changed is how much the agent trusts its own eyes.
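A minimal sketch of that dial, in illustrative Python. It assumes one common way of modeling sensory precision: raise each column of the likelihood to a power gamma and renormalize. Whether the Precision Lab implements its dial this way is not stated here, so treat the mechanism, the four-cell layout, and every number below as assumptions.

```python
import math

# Hypothetical base likelihood: BASE_A[o][s] = P(wall pattern o | cell s).
# Cells 0 and 1 are deliberately aliased (similar columns).
BASE_A = [
    [0.70, 0.60, 0.10, 0.10],
    [0.20, 0.30, 0.80, 0.10],
    [0.10, 0.10, 0.10, 0.80],
]

def with_precision(a, gamma):
    """Assumed precision model: raise each column to gamma, then renormalize."""
    n_obs, n_states = len(a), len(a[0])
    out = [[0.0] * n_states for _ in range(n_obs)]
    for s in range(n_states):
        column = [a[o][s] ** gamma for o in range(n_obs)]
        z = sum(column)
        for o in range(n_obs):
            out[o][s] = column[o] / z
    return out

def posterior(a, prior, o):
    """Bayes' rule over cells for a single observation o."""
    joint = [a[o][s] * prior[s] for s in range(len(prior))]
    z = sum(joint)
    return [j / z for j in joint]

def entropy(p):
    """Shannon entropy in nats; lower means a sharper belief."""
    return -sum(x * math.log(x) for x in p if x > 0.0)

uniform = [0.25] * 4
low = posterior(with_precision(BASE_A, 0.2), uniform, 0)    # dial turned down
high = posterior(with_precision(BASE_A, 4.0), uniform, 0)   # dial turned up
candidates_low = sum(1 for p in low if p > 0.1)
candidates_high = sum(1 for p in high if p > 0.1)
```

The same observation and the same world yield a flatter belief at low gamma and a sharper one at high gamma. Because cells 0 and 1 are aliased in this toy likelihood, the sharp belief still splits between them: precision alone does not resolve aliasing, which is where action comes in.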

Factorization across time: the POMDP form

Real organisms live in time. A useful generative model has to say how states evolve and how policies (courses of action) shape that evolution. In the discrete-time formulation used across our five labs, the model factorizes as a partially observable Markov decision process:

P(o[1:T], s[1:T], u[1:T-1]) =
  P(s[1]) * prod_{t=1..T} P(o[t] | s[t]) * prod_{t=1..T-1} P(s[t+1] | s[t], u[t]) * P(u[1:T-1])

Read left to right. There is a prior over the starting state P(s[1]). At every step, the observation depends only on the current state, P(o[t] | s[t]). The next state depends only on the current state and the chosen action, P(s[t+1] | s[t], u[t]). And there is a distribution over policies, P(u[1:T-1]), which in active inference is shaped by expected free energy over future outcomes rather than by an external reward signal. Class E (Parr et al., 2022, chapter 7).
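The factorization can be read as a recipe for both sampling a trajectory and scoring one. The sketch below is illustrative Python with a hypothetical two-state world, two actions, and a horizon of T = 3; the action names, the policy prior, and all probabilities are assumptions. The policy prior is a fixed table here, where an active-inference agent would derive it from expected free energy.

```python
import itertools
import math
import random

T = 3                                          # horizon (assumed)
D = [0.5, 0.5]                                 # P(s[1])
A = [[0.9, 0.2], [0.1, 0.8]]                   # A[o][s] = P(o | s)
B = {                                          # B[u][s_next][s] = P(s' | s, u)
    "wait":   [[0.90, 0.05], [0.10, 0.95]],
    "repair": [[0.95, 0.70], [0.05, 0.30]],
}
POLICY_PRIOR = {                               # P(u[1:T-1]), a fixed table here
    ("wait", "wait"): 0.5,
    ("repair", "wait"): 0.25,
    ("wait", "repair"): 0.25,
}

def draw(probs, rng):
    """Sample an index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs)[0]

def sample_trajectory(rng):
    """Generate (o, s, u) by following the factorization term by term."""
    policies = list(POLICY_PRIOR)
    u = rng.choices(policies, weights=[POLICY_PRIOR[p] for p in policies])[0]
    s = [draw(D, rng)]
    for action in u:
        s.append(draw([B[action][nxt][s[-1]] for nxt in range(len(D))], rng))
    o = [draw([A[obs][st] for obs in range(len(A))], rng) for st in s]
    return o, s, u

def log_joint(o, s, u):
    """log P(o[1:T], s[1:T], u[1:T-1]) under the factorization."""
    lp = math.log(D[s[0]]) + math.log(POLICY_PRIOR[u])
    for t, state in enumerate(s):
        lp += math.log(A[o[t]][state])             # observation terms, t = 1..T
    for t, action in enumerate(u):
        lp += math.log(B[action][s[t + 1]][s[t]])  # transition terms, t = 1..T-1
    return lp

def total_probability():
    """Sum the joint over every trajectory; a valid factorization sums to 1."""
    total = 0.0
    for u in POLICY_PRIOR:
        for s in itertools.product(range(len(D)), repeat=T):
            for o in itertools.product(range(len(A)), repeat=T):
                total += math.exp(log_joint(o, s, u))
    return total
```

Summing the joint over every possible trajectory is a cheap sanity check that the pieces were assembled correctly; it is only feasible at toy sizes like this one.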

Two matrices carry most of the weight here. The likelihood matrix, often written A, encodes P(o | s). The transition matrix, often written B, encodes P(s' | s, u). A separate vector C encodes prior preferences over observations (what the organism prefers to see), and a vector D encodes the initial-state prior. In our workbench these are literal Elixir data structures the agent reads on every tick.

Discrete-time POMDP versus continuous-time formulation

The POMDP form above is one of two dominant framings. The other is continuous-time, where the generative model is written as a stochastic differential equation and free-energy minimization becomes predictive coding, with prediction errors flowing up a hierarchy. Class E (Parr et al., 2022, chapter 8). The tradeoff is honest: discrete-time POMDPs are easier to inspect and map cleanly onto step-based simulations like ours; continuous-time is closer to how the math is usually written for neural populations. UNI's five browser labs and the Cell Lab workbench are all discrete-time. That is a scope choice, not a claim that continuous formulations are wrong.

How UNI's Elixir workbench encodes a generative model

Concretely, our workbench holds a generative model as a struct %GenModel{a, b, c, d}: a is the likelihood, b is the per-action transition tensor, c is prior preferences over observations, and d is the initial-state prior. Every tick, the agent reads an observation, updates its posterior through a, rolls the posterior forward under each candidate policy using b, scores rollouts by expected free energy against c, and samples an action. Class B (code inspection of the workbench modules driving Precision, Echo, Loop, Heart, and Cell labs).
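The tick described above can be sketched in a few functions. This is illustrative Python, not the Elixir workbench code: the GenModel class mirrors only the four field names, and everything else is an assumption, including one-step policies, the risk-plus-ambiguity form of expected free energy, the softmax with a temperature, and the toy numbers. It also assumes c is a strictly positive distribution over observations.

```python
import math
import random
from dataclasses import dataclass

@dataclass
class GenModel:
    a: list   # a[o][s] = P(o | s)
    b: dict   # b[u][s_next][s] = P(s' | s, u)
    c: list   # preferred distribution over observations (all entries > 0)
    d: list   # initial-state prior

def normalize(v):
    z = sum(v)
    return [x / z for x in v]

def update_posterior(model, belief, o):
    """Perception: weight the current belief by the likelihood of o."""
    return normalize([model.a[o][s] * belief[s] for s in range(len(belief))])

def roll_forward(model, belief, u):
    """Prediction: push the belief through the transition for action u."""
    n = len(belief)
    return [sum(model.b[u][nxt][s] * belief[s] for s in range(n)) for nxt in range(n)]

def expected_free_energy(model, belief, u):
    """One-step G = risk (KL from preferences) + ambiguity (expected entropy of a)."""
    qs = roll_forward(model, belief, u)
    n_obs = len(model.a)
    qo = [sum(model.a[o][s] * qs[s] for s in range(len(qs))) for o in range(n_obs)]
    risk = sum(p * (math.log(p) - math.log(model.c[o]))
               for o, p in enumerate(qo) if p > 0.0)
    ambiguity = 0.0
    for s, weight in enumerate(qs):
        column_entropy = -sum(model.a[o][s] * math.log(model.a[o][s])
                              for o in range(n_obs) if model.a[o][s] > 0.0)
        ambiguity += weight * column_entropy
    return risk + ambiguity

def tick(model, belief, o, rng, temperature=1.0):
    """Observe, infer, score each action, sample one, and predict the next belief."""
    posterior = update_posterior(model, belief, o)
    actions = sorted(model.b)
    scores = [expected_free_energy(model, posterior, u) for u in actions]
    weights = normalize([math.exp(-g / temperature) for g in scores])
    action = rng.choices(actions, weights=weights)[0]
    return action, roll_forward(model, posterior, action), dict(zip(actions, scores))

# Hypothetical service model: states (ok, degraded), observations (green, red).
model = GenModel(
    a=[[0.9, 0.2], [0.1, 0.8]],
    b={"wait": [[0.90, 0.05], [0.10, 0.95]], "repair": [[0.95, 0.70], [0.05, 0.30]]},
    c=[0.95, 0.05],            # strongly prefers to see green
    d=[0.7, 0.3],
)
action, next_belief, scores = tick(model, model.d, 1, random.Random(0))  # o = 1 is red
```

After a red observation in this toy model, the "repair" action scores a lower expected free energy than "wait", so it is sampled more often. The temperature argument plays the role of the policy-temperature dial: higher values flatten the action distribution, lower values make the choice closer to greedy.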

What our current model can do, honestly stated:

Run this loop headlessly across many seeds on hidden state spaces up to a few hundred states (Cell Lab uses 216). Class B
Expose sensory precision, transition precision, and policy temperature as steerable dials that produce distinct behavioral regimes. Class B
Drive the same math from a browser tab and from a public MCP server, so an external language model can call the lab as a tool. Class C

What our current model cannot do, equally honestly:

Deep hierarchical inference over long time horizons in a single agent (we stay shallow on purpose while the benchmark grows).
Continuous-time predictive-coding dynamics (out of scope for the current labs).
Learning the structure of a and b from scratch. Structure learning is on the roadmap; the current labs use hand-specified structure with learnable parameters.

Plain-English translation for engineers

If you build distributed systems, a generative model is the schema plus the physics of your service. Schema is the state space; physics is how state transitions when you push a button; likelihood is your telemetry model. When alarms fire, you are already doing Bayesian inversion by hand, backing out from observed metrics to a probable cause. Active inference is that inversion written down as math and run on every tick, plus a policy step that picks the action minimizing the gap between what you expect and what you want. Same reflex, formalized.

Themesis published a resource map that includes SWU among the five pathways it lists for learning active inference. We link the map as reference, and note that Themesis also runs a separate Python-based course that is complementary to the Elixir stack we use in these labs. See Where to Start with Active Inference, A Resource Map for 2026 and Building Active Inference in Python (Themesis).

A small worked example

Take a two-state world. State is ok or degraded; observations are green or red. Likelihood: ok emits green with probability 0.9; degraded emits red with probability 0.8. Prior: P(ok) = 0.7. You observe red. By Bayes, P(degraded | red) = (0.8 * 0.3) / (0.8 * 0.3 + 0.1 * 0.7) = 0.774. Your belief moved from 30 percent degraded to about 77 percent on one red observation. Repeat every tick, add a transition matrix and policies, and you have Loop Lab. Add spatial state and wall sensors and you have Precision. Add echolocation and you have Echo. Same skeleton throughout.
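The same arithmetic in a few lines of illustrative Python, using the numbers from the example. The second update is an added assumption: it treats the state as unchanged between ticks, since the example has no transition matrix yet.

```python
# Numbers from the worked example: P(ok) = 0.7, ok emits green with 0.9,
# degraded emits red with 0.8.
PRIOR = {"ok": 0.7, "degraded": 0.3}
LIKELIHOOD = {
    "ok": {"green": 0.9, "red": 0.1},
    "degraded": {"green": 0.2, "red": 0.8},
}

def update(belief, observation):
    """One Bayesian update: posterior is proportional to likelihood times belief."""
    joint = {s: LIKELIHOOD[s][observation] * p for s, p in belief.items()}
    evidence = sum(joint.values())
    return {s: j / evidence for s, j in joint.items()}

after_one_red = update(PRIOR, "red")
# Assumption: the state does not change between ticks (no transition step).
after_two_reds = update(after_one_red, "red")

print(round(after_one_red["degraded"], 3))  # 0.774
```

Feeding each posterior back in as the next prior is the "repeat every tick" step; a full agent would push the belief through the transition matrix between updates.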

Falsifier. If observations arrive that cannot be reconstructed as P(o | s) for any reasonable s in our model, the model is wrong and must be revised. Concretely, this triggers when: (a) two policies that our expected-free-energy calculation ranks as clearly separate produce indistinguishable RecoveryScore distributions on the Cell Lab benchmark across seeds; (b) the precision-as-upstream-variable claim from the Loop Lab fails to hold when we widen the state space; or (c) an external replication with an independent Elixir or Python implementation returns significantly different results on the pre-registered disturbance families. Any of these forces a documented revision, published as an evidence-class F entry in the running errata. Class F

Where to go next

Active Inference Fundamentals, A Working Map

The one-page map: free energy, the Markov blanket, and how the labs fit together.

Hidden States vs Observations, A Builder's View

The split from the engineering side: aliasing, precision, and why the wall you touch is not the cell you are in.

Prior Preferences and Goal-Directed Behavior

How the C vector turns "what I want to see" into policy selection.

The Benchmark, What Stratified Palimpsest Actually Tests

The pre-registered falsification benchmark, its claims, and its losses.

For the full receipts, see transparency: the preprint, the pre-registered benchmark, and the running errata.