The science, in the open

Active inference you can steer, test, and call.

UNI is one idea taken seriously: minds and organizations get by predicting what is about to happen, sensing what actually happens, and acting to keep the gap small. Here is the math, the labs, the benchmark that tries to prove us wrong, and a server any language model can call.

The free energy review, written in the open.

Unrefereed preprint, expert review pending

An Organic Operator and AI Operator Collaborative Review of Active Inference Free Energy Minimization

Polzin et al. (2026). DOI 10.5281/zenodo.19785799.

A plain-spoken review of active inference and the free energy principle: the idea that a system maintains itself by minimizing variational free energy, an upper bound on surprise, through both perception and action. The labs on this site are the math made steerable. Cite this as a preprint: it is not yet peer reviewed.
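The "upper bound on surprise" claim can be checked by hand on a toy model. What follows is a minimal sketch, not code from the preprint or the labs: a two-state discrete model with made-up numbers (the prior, the likelihood, and every function name are illustrative assumptions). It computes the variational free energy F[q] = sum_s q(s) (ln q(s) - ln p(o, s)) for one observation and compares it with the surprise, -ln p(o). It covers only the perception side; action is not modeled here.

```python
import math

def free_energy(q, prior, likelihood_o):
    """Variational free energy F[q] = sum_s q(s) * (ln q(s) - ln p(o, s))
    for a single observed outcome o, where p(o, s) = p(s) * p(o | s)."""
    total = 0.0
    for q_s, p_s, p_o_given_s in zip(q, prior, likelihood_o):
        if q_s > 0.0:  # terms with q(s) = 0 contribute nothing
            total += q_s * (math.log(q_s) - math.log(p_s * p_o_given_s))
    return total

def surprise(prior, likelihood_o):
    """Surprise -ln p(o), with p(o) = sum_s p(s) * p(o | s)."""
    evidence = sum(p_s * l for p_s, l in zip(prior, likelihood_o))
    return -math.log(evidence)

def exact_posterior(prior, likelihood_o):
    """Bayes posterior p(s | o), the belief that makes the bound tight."""
    joint = [p_s * l for p_s, l in zip(prior, likelihood_o)]
    z = sum(joint)
    return [j / z for j in joint]

# Illustrative numbers only (assumptions, not taken from the preprint).
prior = [0.5, 0.5]          # p(s) over two hidden states
likelihood_o = [0.9, 0.2]   # p(o | s) for the outcome actually observed

q_guess = [0.5, 0.5]                          # an uninformed belief
q_star = exact_posterior(prior, likelihood_o)  # the exact posterior

print("surprise           :", surprise(prior, likelihood_o))
print("F at uniform belief:", free_energy(q_guess, prior, likelihood_o))
print("F at exact posterior:", free_energy(q_star, prior, likelihood_o))
```

Any belief q gives a free energy at or above the surprise, and the gap is the KL divergence between q and the exact posterior, so it closes only when the belief matches that posterior.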

Read the preprint on Zenodo ›

Foundations. The POMDP formulation follows Parr, Pezzulo and Friston (2022), Active Inference: The Free Energy Principle in Mind, Brain, and Behavior, MIT Press. The Cell Lab framing follows Mikkilineni (2022), DOI 10.3390/info13010024, which is cited here but never hosted.

Five instruments, all running in your browser.

Zero backend, zero dependencies. Move the dials and watch inference change in front of you.

Pre-registered, and built to be proven wrong.

The Cell Lab is an open falsification benchmark. Five claims were written down, each with its falsification criterion, before any runs, and every loss is recorded as plainly as every win. UNI beats a random controller in 7 of 7 disturbance families (significant in 6 of 7), the rule-based SRE baseline in 6 of 7, and a neural baseline in 5 of 7. It fails to take the top score in three families, shown below: memory_leak and cpu_noisy_neighbor go to the neural baseline, and database_flaky goes to the rule-based baseline. A single active-inference controller is not universally best, which is exactly the kind of disconfirmation the benchmark is designed to surface.

Disturbance family    UNI     rule-based   random   neural   Notable
traffic_spike         0.969   0.960        0.880    0.924    UNI wins vs random (sig)
memory_leak           0.740   0.731        0.676    0.810    neural wins overall
bad_deploy            0.937   0.895        0.621    0.704    UNI wins (sig)
database_flaky        0.759   0.803        0.675    0.694    rule-based wins overall
cache_down            0.664   0.641        0.518    0.588    UNI wins (sig)
cpu_noisy_neighbor    0.749   0.740        0.703    0.824    neural wins; UNI vs random not significant
observability_loss    0.992   0.988        0.961    0.974    UNI wins (sig)

Scores are RecoveryScore, the fraction of ticks inside the viable set, weighted by excursion depth; higher is better. Committed cache: depth 2, 6 seeds, 80 ticks. "sig" means a bootstrap 95% confidence interval for the median paired difference excludes 0.
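The "sig" rule can be sketched in a few lines. This is an illustration under stated assumptions, not the benchmark's own code. It assumes a plain percentile bootstrap, 10,000 resamples, and differences paired by seed; the source does not say which bootstrap variant or resample count the Cell Lab uses. The two lists of per-seed differences are invented for the example and are not benchmark data.

```python
import random
import statistics

def bootstrap_median_ci(diffs, n_resamples=10_000, alpha=0.05, seed=0):
    """Percentile-bootstrap confidence interval for the median of paired
    differences (e.g. UNI score minus baseline score, one value per seed)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(diffs)
    medians = sorted(
        statistics.median(rng.choices(diffs, k=n))  # resample with replacement
        for _ in range(n_resamples)
    )
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return medians[lo_idx], medians[hi_idx]

def is_significant(diffs, **kwargs):
    """'sig' in the sense used above: the interval excludes 0."""
    lo, hi = bootstrap_median_ci(diffs, **kwargs)
    return lo > 0 or hi < 0

# Hypothetical per-seed differences for 6 seeds (NOT benchmark data).
clear_win = [0.05, 0.07, 0.04, 0.06, 0.08, 0.05]     # every seed favors UNI
mixed_bag = [0.03, -0.04, 0.02, -0.01, 0.05, -0.03]  # seeds disagree

for name, diffs in [("clear_win", clear_win), ("mixed_bag", mixed_bag)]:
    lo, hi = bootstrap_median_ci(diffs)
    print(name, "95% CI:", (lo, hi), "sig:", is_significant(diffs))
```

With only 6 seeds the interval is coarse, which is one reason a win on the mean score can still fail the significance test.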

Call the lab directly, over MCP.

The deployment exposes a public, anonymous Model Context Protocol server. Any LLM client can introspect, simulate, and drive the labs. No auth, no token, just the URL.

https://universalnaturalintelligence.com/api/mcp

16 tools in two groups.

Headless (6), running server-side now: list_labs, list_mazes, describe_dial, run_episode, run_sweep, compare_labs.

Live (10), driving a user's open lab tab: attach_session, detach_session, set_dial, switch_maze, set_planning_depth, set_action_mode, step_agent, auto_run, reset, read_state.
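For clients without an MCP library, the wire format is JSON-RPC 2.0. The sketch below builds the standard MCP messages (initialize, tools/list, tools/call) and posts them with Python's standard library. Several things here are assumptions rather than documented facts about this server: that it speaks the Streamable HTTP transport, that it accepts the protocol version string shown, that list_labs takes no arguments, and that no session header is needed after the handshake. The helper names are invented for the example. Asking the server for tools/list first is the safe way to learn each tool's real argument schema.

```python
import json
import urllib.request

MCP_URL = "https://universalnaturalintelligence.com/api/mcp"

def jsonrpc(method, params=None, request_id=1):
    """Build a JSON-RPC 2.0 request object."""
    message = {"jsonrpc": "2.0", "id": request_id, "method": method}
    if params is not None:
        message["params"] = params
    return message

def initialize_request(request_id=0):
    """MCP handshake. The protocol version is an assumption about this server."""
    return jsonrpc(
        "initialize",
        {
            "protocolVersion": "2025-03-26",
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},
        },
        request_id,
    )

def list_tools_request(request_id=1):
    """Ask the server to describe its tools and their input schemas."""
    return jsonrpc("tools/list", request_id=request_id)

def call_tool_request(name, arguments=None, request_id=2):
    """Invoke one tool by name; arguments must match the tool's own schema."""
    return jsonrpc(
        "tools/call", {"name": name, "arguments": arguments or {}}, request_id
    )

def post(message, url=MCP_URL, timeout=30):
    """POST one message and return the raw response body as text.
    The body may be plain JSON or a server-sent event stream, depending on
    the server, so it is returned unparsed."""
    request = urllib.request.Request(
        url,
        data=json.dumps(message).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Accept": "application/json, text/event-stream",
        },
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Nothing above touches the network until post is called. A first session might run post(initialize_request()), then post(list_tools_request()), then post(call_tool_request("list_labs")), with the responses read before any further calls are built.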

Machine-readable index for crawlers and agents: llms.txt and llms-full.txt.

What this is not.

Not a clinical tool, not a diagnostic instrument, not therapy or treatment advice, and not evidence that active inference is the correct theory of mental health. Behavioral labels are candidate computational phenotypes: hypotheses, not diagnoses. The preprint is not yet peer reviewed. We would rather be correct than impressive.

The science, the delivery, the evidence.