KL divergence and Bayesian inference

Conjugate priors: when they help, when they hide work.

Closed-form Bayesian updates look like magic the first time you meet them. You count, you add, you normalize, and out drops a posterior. The magic is not free. It is a modeling contract about the shape of your uncertainty, and when the contract matches your problem the algebra sings. When it does not, the closed form quietly hides the work you still need to do.

The contract, in one paragraph

A prior is called conjugate to a likelihood when the posterior lives in the same parametric family as the prior. The Beta is conjugate to the Bernoulli. The Dirichlet is conjugate to the categorical and the multinomial. The Gamma is conjugate to the Poisson rate. A Normal prior on the mean is conjugate to a Normal likelihood with known variance. Parr, Pezzulo and Friston set this out cleanly in their treatment of the generative model, then use Dirichlet priors over the categorical parameters of active-inference POMDPs so that learning the A, B, and D matrices reduces to incrementing counts Class E (Parr, Pezzulo, Friston, Active Inference, MIT Press 2022).
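To make "count, add, normalize" concrete, here is a minimal Python sketch of the Beta-Bernoulli and Dirichlet-categorical updates. The names Beta, dirichlet_update, and dirichlet_mean are illustrative choices for this example, not taken from any library or from the UNI codebase.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Beta:
    """Beta(alpha, beta) belief about a Bernoulli success probability."""
    alpha: float
    beta: float

    def update(self, successes: int, failures: int) -> "Beta":
        # Conjugate update: add observed counts to the pseudo-counts.
        return Beta(self.alpha + successes, self.beta + failures)

    @property
    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)


def dirichlet_update(concentration, counts):
    """Dirichlet-categorical update: add per-category counts."""
    return [a + n for a, n in zip(concentration, counts)]


def dirichlet_mean(concentration):
    """Posterior mean: normalize the concentration parameters."""
    total = sum(concentration)
    return [a / total for a in concentration]


# Flat Beta(1, 1) prior, then 7 successes and 3 failures.
posterior = Beta(1.0, 1.0).update(successes=7, failures=3)
print(posterior, posterior.mean)  # Beta(8, 4), mean 2/3

# Flat Dirichlet(1, 1, 1) prior, then counts of 2, 0 and 5.
dir_post = dirichlet_update([1.0, 1.0, 1.0], [2, 0, 5])
print(dir_post, dirichlet_mean(dir_post))  # [3, 1, 6], mean [0.3, 0.1, 0.6]
```

The whole update is addition, and the posterior mean is one normalization. Everything the data contributes is carried by the counts.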

Why the algebra pays off

Three concrete wins, all consequences of the same closed-form update Class C:

Where the closed form hides work

The contract is that your uncertainty is Dirichlet-shaped, or Beta-shaped, or Gaussian-shaped. If the world hands you uncertainty of a different shape, the closed form does not warn you. It just gives you a wrong posterior with the right units.
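A toy case shows the gap. Suppose the belief you actually hold is "this coin is either strongly heads-biased or strongly tails-biased." A single Beta density has at most one mode inside (0, 1), so it cannot express that. A mixture of Betas can, and it stays closed-form: each component updates by counts, and the mixture weights are rescaled by each component's marginal likelihood. The sketch below is illustrative only; the function names and the specific prior parameters are assumptions chosen for the example.

```python
from math import exp, lgamma


def log_beta_fn(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def update_beta_mixture(components, heads, tails):
    """Update a mixture of Betas given as (weight, alpha, beta) triples."""
    updated = []
    for w, a, b in components:
        # Marginal likelihood of the observed flips under this component.
        # exp() is fine for small counts; stay in log space for large ones.
        log_evidence = log_beta_fn(a + heads, b + tails) - log_beta_fn(a, b)
        updated.append((w * exp(log_evidence), a + heads, b + tails))
    total = sum(w for w, _, _ in updated)
    return [(w / total, a, b) for w, a, b in updated]


def predictive_heads(components):
    """Probability that the next flip is heads."""
    return sum(w * a / (a + b) for w, a, b in components)


# Two priors with the same mean (0.5) but very different shapes.
bimodal_prior = [(0.5, 20.0, 2.0), (0.5, 2.0, 20.0)]  # "one bias or the other"
flat_prior = [(1.0, 1.0, 1.0)]                        # single Beta(1, 1)

bimodal_post = update_beta_mixture(bimodal_prior, heads=3, tails=0)
flat_post = update_beta_mixture(flat_prior, heads=3, tails=0)

print(predictive_heads(bimodal_post))  # about 0.92
print(predictive_heads(flat_post))     # 0.8
```

Both priors predict heads with probability 0.5 before any data. After three heads the mixture has put nearly all its weight on the heads-biased component, while the single Beta has only moved to 0.8. Neither number is wrong as arithmetic. They answer different modeling contracts.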

A working test

Before accepting a conjugate prior, ask three questions. Does the family admit the shape of belief you actually hold? Does the sufficient-statistic story match how observations arrive? Would a mixture, a hierarchical model, or a non-parametric prior express something the closed form cannot? If either of the first two answers is no, or the third is yes, the closed form is fine as a first pass and a liability as a final answer Class C.

How UNI uses this in practice

In the UNI POMDP labs the categorical parameters of the generative model carry Dirichlet priors, and learning proceeds by incremental count updates against observed outcomes Class C. That choice is deliberate: the agent's belief about state transitions and observation likelihoods is genuinely categorical, the Dirichlet shape is a fair match, and the closed form keeps the labs runnable in a browser tab with no backend. The Cell Lab benchmark shows the resulting controller does well in most disturbance families and loses in three, which is exactly the kind of loss you would expect a unimodal count-based posterior to take when the world briefly changes regime. The failure is legible because the prior is legible. That is the payoff of the contract, not a workaround for it.
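The regime-change weakness is easy to reproduce in miniature. The sketch below is a toy illustration with made-up numbers. It is not the Cell Lab benchmark and not the UNI implementation. It feeds a count-based Dirichlet posterior 200 observations from one regime, then 10 from another.

```python
def dirichlet_posterior_mean(prior, observations):
    """Posterior mean of a Dirichlet-categorical model after count updates."""
    counts = list(prior)
    for outcome in observations:
        counts[outcome] += 1.0  # learning is just incrementing a count
    total = sum(counts)
    return [c / total for c in counts]


prior = [1.0, 1.0, 1.0]   # flat prior over three outcomes
stable = [0] * 200        # long stretch where outcome 0 always occurs
shifted = [2] * 10        # brief regime change: outcome 2 takes over

before = dirichlet_posterior_mean(prior, stable)
after = dirichlet_posterior_mean(prior, stable + shifted)

print(before)  # about [0.990, 0.005, 0.005]
print(after)   # about [0.944, 0.005, 0.052]
```

After ten consecutive observations of the new regime, the posterior mean still puts about 94% on the old outcome. The accumulated counts are doing exactly what the contract says: they treat every observation as exchangeable evidence about one fixed distribution.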

For the same reason we do not reach for conjugacy inside the higher-level planning objective, where preferences over outcome distributions are more naturally written as targets in the same simplex the model already lives in, and where KL divergences enter the expected free energy directly.

UNI is a working hypothesis on an attainable path toward General Natural Intelligence: a natural, active-inference approach whose evidence is growing, evidence-classed, and tested in the open. Do not take the claim on faith. Test the build, inspect the gates, and help us find where it fails.
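The KL term in the planning objective mentioned above can be sketched for the categorical case. With a predicted outcome distribution q under some policy and a preferred outcome distribution p on the same simplex, the divergence is KL[q || p] = sum_i q_i log(q_i / p_i). The code below is a minimal sketch under that assumption. The function name and the example distributions are hypothetical, and it covers only this one term, not a full expected free energy.

```python
from math import log


def kl_categorical(q, p):
    """KL[q || p] for two categorical distributions on the same outcomes."""
    if len(q) != len(p):
        raise ValueError("q and p must have the same number of outcomes")
    total = 0.0
    for qi, pi in zip(q, p):
        if qi == 0.0:
            continue             # 0 * log(0 / p) is taken as 0
        if pi == 0.0:
            return float("inf")  # q puts mass where p forbids it
        total += qi * log(qi / pi)
    return total


preferred = [0.8, 0.1, 0.1]  # target distribution over three outcomes
policy_a = [0.7, 0.2, 0.1]   # predicted outcomes close to the target
policy_b = [0.1, 0.1, 0.8]   # predicted outcomes far from the target

print(kl_categorical(policy_a, preferred))  # about 0.045
print(kl_categorical(policy_b, preferred))  # about 1.456
```

The policy whose predicted outcomes sit closer to the preferred distribution scores the lower divergence. The comparison is made directly between distributions on the simplex and needs no conjugate update.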

The Cell Lab benchmark, the preprint at DOI 10.5281/zenodo.19785799, and the five browser labs on this site all sit on Class B and Class C evidence. The behavioral claims are candidate computational phenotypes, hypotheses, not diagnoses. Nothing here is clinical guidance.