Every generative model an active-inference agent carries has two halves that engineers routinely confuse. There is what the agent gets, the observations. There is what the agent has to guess at, the hidden states. Getting the split right is the difference between a controller that thinks and a lookup table that guesses.
The distinction, stated plainly
In an active-inference POMDP the generative model over observations o, hidden states s, and actions u factors into a likelihood P(o | s), a transition P(s' | s, u), and prior preferences P(o) over what the agent expects to observe when things are going well (Class E, after Parr, Pezzulo and Friston, 2022). The observation o is what the sensor delivers on this tick. The hidden state s is the compact latent variable that the observation gives a noisy, partial, sometimes aliased view of.
Two lines that engineers should tape to the wall. First: the agent never sees s. It only ever sees o, and infers a posterior belief Q(s) over the hidden state by Bayesian inference against its own generative model (Class E). Second: the boundary between o and s is a modeling choice you make when you build the agent. Move the boundary and you get a different agent (Class C, from how our labs are wired).
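The first of those two lines can be made concrete with a minimal sketch of exact Bayesian inference over a discrete hidden state. The two-state model, the state and observation names, and every number in the likelihood table are illustrative assumptions, not values taken from the labs.

```python
# Hypothetical two-state, two-observation model; all numbers are illustrative.
STATES = ["normal", "saturating"]

# LIKELIHOOD[s][o] = P(o | s); each row sums to 1.
LIKELIHOOD = {
    "normal":     {"latency_ok": 0.9, "latency_high": 0.1},
    "saturating": {"latency_ok": 0.3, "latency_high": 0.7},
}


def posterior(prior, observation):
    """Return Q(s), proportional to P(o | s) * P(s), for one observation."""
    unnormalized = {s: LIKELIHOOD[s][observation] * prior[s] for s in STATES}
    evidence = sum(unnormalized.values())  # P(o), the normalizer
    return {s: p / evidence for s, p in unnormalized.items()}


# The agent is handed only the observation; the state itself is never passed in.
prior = {"normal": 0.8, "saturating": 0.2}
q = posterior(prior, "latency_high")
print(q)
```

Note that `posterior` takes an observation and a prior belief and nothing else. If a function like this ever needs the true state as an argument, the boundary between o and s has been drawn in the wrong place.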
Inference keeps the gap between Q(s) and P(s | o) small, measured in nats, not joules.
Why the split is load-bearing
Three consequences fall out immediately once you take the observation/hidden-state split seriously.
- Precision is a knob on the likelihood, not on the world. Sensory precision is how much the agent trusts its own P(o | s). Turn it up and the posterior collapses toward whatever the observation implies. Turn it down and the prior dominates. The Precision Lab lets you move that knob and watch behavior change (Class C).
- Aliasing is a design signal, not a bug. When two hidden states produce indistinguishable observations, the agent is right to be uncertain. A controller that hides that uncertainty behind a confident choice is lying to its operator. Surfacing the aliasing is the honest move.
- Goal-directed behavior lives in the observation space, not the hidden-state space. Prior preferences P(o) say what the agent wants to observe. That is where you encode intent. Trying to encode intent directly over hidden states is a common builder mistake and it degrades the KL-divergence term the whole framework is trying to minimize (Class E, per Parr et al.).
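The precision point can be sketched in a few lines. One common way to model sensory precision, assumed here and not necessarily how the Precision Lab implements it, is to raise each likelihood entry to a power and renormalize over observations. The model, the names, and the numbers are all hypothetical.

```python
# Hypothetical model; numbers are illustrative only.
STATES = ["normal", "saturating"]
PRIOR = {"normal": 0.8, "saturating": 0.2}
LIKELIHOOD = {
    "normal":     {"latency_ok": 0.9, "latency_high": 0.1},
    "saturating": {"latency_ok": 0.3, "latency_high": 0.7},
}


def tempered_likelihood(precision):
    """Raise each P(o | s) to `precision`, then renormalize over observations."""
    tempered = {}
    for s, row in LIKELIHOOD.items():
        powered = {o: p ** precision for o, p in row.items()}
        total = sum(powered.values())
        tempered[s] = {o: v / total for o, v in powered.items()}
    return tempered


def posterior(observation, precision):
    """Q(s) under the precision-weighted likelihood and the fixed prior."""
    lik = tempered_likelihood(precision)
    unnormalized = {s: lik[s][observation] * PRIOR[s] for s in STATES}
    z = sum(unnormalized.values())
    return {s: v / z for s, v in unnormalized.items()}


# Precision 0 flattens the likelihood, so the posterior equals the prior.
# Larger precision sharpens it, so the observation dominates.
for gamma in (0.0, 1.0, 4.0):
    q = posterior("latency_high", gamma)
    print(gamma, {s: round(p, 3) for s, p in q.items()})
```

Nothing about the world changes between the three runs. Only the agent's trust in its own likelihood changes, which is the sense in which precision is a knob on the model.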
A small conceptual example, in the workbench idiom
Take a service cell running on the UNI workbench. The observation vector o is what your telemetry actually emits: request rate, p95 latency, error rate, a saturation signal, a health probe. The hidden state s is what those signals are shadows of: the underlying regime (normal, saturating, degrading upstream dependency, bad deploy, memory pressure).
The agent never gets to read the regime label off the box. It gets the five-number observation and computes Q(s). When p95 climbs and the probe still passes, two hypotheses stay alive in the posterior: saturate and dep_bad. A precision-honest controller widens the belief instead of picking. Then it takes an epistemic action, a drain step that is cheap and reversible, whose expected observation will separate the two hypotheses. That is the standard expected free energy (EFE) decomposition: exploration when uncertainty is high, commitment when the posterior sharpens (Class E).
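The epistemic step above can be sketched as a comparison of expected information gain between candidate actions. This covers only the epistemic part of expected free energy; the preference term is left out for brevity. The action names, outcome labels, and probabilities are hypothetical.

```python
import math


def kl(q, p):
    """KL[q || p] in nats over a shared discrete support."""
    return sum(q[s] * math.log(q[s] / p[s]) for s in q if q[s] > 0)


def expected_info_gain(belief, likelihood):
    """Expected KL from current belief to posterior, averaged over predicted outcomes."""
    outcomes = list(next(iter(likelihood.values())).keys())
    gain = 0.0
    for o in outcomes:
        p_o = sum(likelihood[s][o] * belief[s] for s in belief)
        if p_o == 0:
            continue
        post = {s: likelihood[s][o] * belief[s] / p_o for s in belief}
        gain += p_o * kl(post, belief)
    return gain


belief = {"saturate": 0.5, "dep_bad": 0.5}

# OUTCOME_MODEL[action][state][outcome] = P(outcome | state, action), all assumed.
OUTCOME_MODEL = {
    "wait": {   # both hypotheses predict the same thing: nothing is learned
        "saturate": {"p95_drops": 0.1, "p95_stays_high": 0.9},
        "dep_bad":  {"p95_drops": 0.1, "p95_stays_high": 0.9},
    },
    "drain": {  # the hypotheses disagree about what draining will show
        "saturate": {"p95_drops": 0.8, "p95_stays_high": 0.2},
        "dep_bad":  {"p95_drops": 0.1, "p95_stays_high": 0.9},
    },
}

gains = {a: expected_info_gain(belief, lik) for a, lik in OUTCOME_MODEL.items()}
best = max(gains, key=gains.get)
print(gains, best)
```

Under these assumed numbers, waiting has zero expected information gain because both hypotheses predict the same outcomes, while draining has positive gain because they predict different ones.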
Contrast this with a controller that treats the observation as the state. It sees rising p95 and rolls back the last deploy. It was a noisy neighbor. The rollback did nothing, or made it worse. The observation-as-state controller has no way to represent "I do not know yet, and here is the cheapest experiment to find out."
How this shows up in the labs
Every lab on this site enforces the split at the code level (Class C). In the Precision Lab the agent's wall sensor is o; the maze cell it occupies is s. In the Echo Lab the same maze is s but o is a range-2 echo return, which aliases badly at corners and forces a very different posterior shape. In the Cell Lab the 216-state hidden regime is s and the five-number telemetry vector is o. The disturbance families in the pre-registered benchmark are designed so that some are trivially observable and some are structurally aliased, exactly to test whether the observation/hidden-state split is doing real work.
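Structural aliasing can be checked directly from a likelihood table. The sketch below flags pairs of hidden states whose observation distributions are nearly identical in total variation distance. The cell names, echo readings, and probabilities are hypothetical and are not the Echo Lab's actual model.

```python
from itertools import combinations


def aliased_pairs(likelihood, tol=0.05):
    """Return (state_a, state_b, distance) for rows of P(o | s) within `tol` in total variation."""
    pairs = []
    for a, b in combinations(sorted(likelihood), 2):
        observations = set(likelihood[a]) | set(likelihood[b])
        tv = 0.5 * sum(
            abs(likelihood[a].get(o, 0.0) - likelihood[b].get(o, 0.0))
            for o in observations
        )
        if tv <= tol:
            pairs.append((a, b, tv))
    return pairs


# Hypothetical echo model: two corners return the same echo distribution.
ECHO_LIKELIHOOD = {
    "corner_ne": {"echo_near": 0.9, "echo_far": 0.1},
    "corner_sw": {"echo_near": 0.9, "echo_far": 0.1},
    "corridor":  {"echo_near": 0.2, "echo_far": 0.8},
}

print(aliased_pairs(ECHO_LIKELIHOOD))
```

A non-empty result means no single observation can separate those states, so the posterior should stay spread across them until an action or a sequence of observations breaks the tie.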
Builder checklist
- Write down o, s, and u before you write any code. If you cannot separate them on paper, your code will not separate them either.
- Never let a component of s leak into o. If your agent's "sensor" is reading a field the environment computed from the hidden regime, you have built a lookup table with extra steps.
- Put intent in P(o), not in s. Encode "I want to observe latency inside the viable band," not "I want to be in the normal regime."
- Report the posterior Q(s), not just the argmax. Operators can act on "70 percent saturate, 25 percent dep_bad, 5 percent everything else." They cannot audit a single label.
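The last item can be sketched as a small reporting helper that keeps the leading hypotheses and folds the remainder into one bucket. The function name, the `top_k` parameter, and the example posterior are assumptions made for illustration.

```python
def report(posterior, top_k=2):
    """Summarize Q(s): the top_k hypotheses by probability, plus a remainder bucket."""
    ranked = sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)
    head, tail = ranked[:top_k], ranked[top_k:]
    parts = [f"{round(p * 100)} percent {s}" for s, p in head]
    if tail:
        rest = sum(p for _, p in tail)
        parts.append(f"{round(rest * 100)} percent everything else")
    return ", ".join(parts)


# Hypothetical posterior over regimes.
q = {"saturate": 0.70, "dep_bad": 0.25, "normal": 0.03, "bad_deploy": 0.02}
print(report(q))
```

Returning the full ranking, not only the first entry, is what lets an operator see how much probability mass the runner-up still holds.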
What this does not yet prove
The split above is a modeling discipline with an empirical payoff in our published benchmark runs, and a theoretical grounding in the active-inference literature (Class E). It is not a proof that any specific hidden-state factorization is the correct one for your system. Pick the factorization, publish it, then let the falsification runs try to break it. That is what the transparency page is for.