Bellman Foundations¶
DDSL Phase 1.1 — Deck 1/2¶
Convening Meeting — January 2026
Motivation: DP does not have one canonical operator ordering¶
A dynamic program can be written as a composition of operators:

- expectation / integration
- maximization / decision
- push-forward / simulation
- approximation, etc.
But perches are not “endpoints of operators.”
The framing we want¶
Previous framing (wrong):
Perches are endpoints of operators (expectation, maximization, …)
Correct framing:
Perches are information sets — σ-algebras (filtrations) describing what is known at each point.
Perches exist to answer one question:
What can the actor condition their policy on?
The general setting: partially observable MDPs (POMDPs)¶
In a POMDP, the actor:

- has a true state \(s \in S\) (possibly hidden)
- receives observations \(o = O(s,\eta)\)
- must maintain a belief \(b(s)\) over the hidden state
- chooses actions based on observation history
Problem: optimal policies become maps \(\mathcal{P}(S) \to A\). (Scope pointer: see Information Structure Restriction.)
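The belief recursion behind that \(\mathcal{P}(S) \to A\) blow-up can be made concrete with a tiny Bayes filter. A minimal sketch; the two-state matrices below are invented for illustration, not from the source:

```python
import numpy as np

# Hidden Markov model with 2 states and 2 observations (made-up numbers).
P = np.array([[0.9, 0.1],     # P[s, s'] : transition probabilities
              [0.2, 0.8]])
O = np.array([[0.8, 0.2],     # O[s', o] : observation likelihoods
              [0.3, 0.7]])

def belief_update(b, o):
    """One step of the recursion b'(s') ∝ O(o|s') * sum_s P(s'|s) b(s)."""
    predicted = b @ P             # propagate the belief through the dynamics
    unnorm = predicted * O[:, o]  # reweight by the observation likelihood
    return unnorm / unnorm.sum()

b = np.array([0.5, 0.5])          # uninformed prior
b = belief_update(b, o=0)         # the belief itself is now the state
```

The point of the restriction that follows: unless something replaces `b`, this full distribution must be carried as the state.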
Our restriction: observable sufficient statistics¶
We restrict attention to problems where:
The information available at decision time is a sufficient statistic for continuation.
So we avoid full belief-state POMDPs while covering most economic models. (Scope pointer: see Information Structure Restriction.)
The three perches (three filtrations)¶
Three filtrations, nested as \(\mathcal{F}_{\text{arvl}} \subseteq \mathcal{F}_{\text{dcsn}} \subseteq \mathcal{F}_{\text{cntn}}\); everything else is bookkeeping.
Perch tags mean “adapted to the filtration”¶
In DDSL-SYM, a perch index is a measurability claim:
- \(z[<]\) means \(z\) is \(\mathcal{F}_{\text{arvl}}\)-measurable
- \(z\) (unmarked) means \(z\) is \(\mathcal{F}_{\text{dcsn}}\)-measurable
- \(z[>]\) means \(z\) is \(\mathcal{F}_{\text{cntn}}\)-measurable
So every transition / equation must be written so that the mapping exists: the RHS can only use information available at that perch.
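A minimal sketch of this measurability discipline (illustrative Python, not DDSL syntax): tag each variable with its perch, and reject any assignment whose right-hand side reads later information.

```python
# Perch tags as measurability levels: arvl < dcsn < cntn.
# An assignment evaluated at a perch may only read variables
# measurable at (i.e. tagged no later than) that perch.
LEVEL = {"arvl": 0, "dcsn": 1, "cntn": 2}

def check_assignment(target_perch, rhs_perches):
    """True iff every RHS variable is measurable by `target_perch`."""
    return all(LEVEL[p] <= LEVEL[target_perch] for p in rhs_perches)

# A decision-perch equation may use arrival-perch information ...
ok = check_assignment("dcsn", ["arvl", "dcsn"])
# ... but not continuation-perch information (it is not known yet).
bad = check_assignment("dcsn", ["cntn"])
```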
Arrival perch = prior filtration¶
The state before any within-stage observations or decisions.
“What do I know coming in?”
Decision perch = observable filtration¶
All information used by the actor to choose: \(x_{\text{dcsn}} = (x_{\text{arvl}}, \zeta)\), where \(\zeta\) are shocks/observations revealed before action.
“What can my policy condition on?”
Continuation perch = full filtration¶
The realized outcome after action and all within-stage uncertainty: \(x_{\text{cntn}} = T(x_{\text{dcsn}}, \pi, \eta)\), where \(\eta\) are shocks revealed after action \(\pi\).
“What is the realized state passed onward?”
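The three perches can be walked through in one toy savings stage (a sketch; the functional forms, policy, and shock values are invented for illustration):

```python
import random

random.seed(0)

def one_stage(w_arvl):
    """Walk one stage through the three perches (toy savings model)."""
    # Arrival perch: only incoming wealth w_arvl is known.
    zeta = random.choice([0.5, 1.5])    # income shock, observed BEFORE action
    # Decision perch: the policy may condition on (w_arvl, zeta).
    c = 0.5 * (w_arvl + zeta)           # an arbitrary illustrative policy
    # Continuation perch: eta is revealed only AFTER the action.
    eta = random.choice([0.95, 1.05])   # return shock, post-action
    w_cntn = eta * (w_arvl + zeta - c)  # realized state passed onward
    return w_cntn

w_next = one_stage(1.0)
```

Note the code cannot even be written with `c` depending on `eta`: the ordering of the lines enforces the measurability claims.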
The Markov restriction (the tractability constraint)¶
The continuation state depends only on the decision state, the action, and fresh within-stage shocks:

\(x_{\text{cntn}} = T(x_{\text{dcsn}}, \pi, \eta)\)

Equivalently: the decision state is a sufficient statistic for continuation, so no history or belief beyond \(x_{\text{dcsn}}\) needs to be carried.
This prevents a drift back to full POMDP belief tracking. (Scope pointer: see Information Structure Restriction.)
Two timing patterns (both fit the same three perches)¶
Pattern A (observed shock before action): arrival → observe \(\zeta\) → decide \(\pi\) → continuation.

Pattern B (shock after action): arrival → decide \(\pi\) → realize \(\eta\) → continuation.
Expectation placement follows timing¶
If shocks are observed before action, expectations naturally live “on the left”:

\(V_{\text{arvl}}(x) = \mathbb{E}_{\zeta}\bigl[\max_{\pi} V_{\text{cntn}}(T(x, \zeta, \pi))\bigr]\)

If shocks are unobserved at action time, expectations naturally live “inside the choice”:

\(V_{\text{dcsn}}(x) = \max_{\pi} \mathbb{E}_{\eta}\bigl[V_{\text{cntn}}(T(x, \pi, \eta))\bigr]\)
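A two-point numerical check of why the placement matters (payoffs and probabilities are invented for illustration): observing the shock before acting is weakly better, since \(\mathbb{E}[\max] \ge \max \mathbb{E}\).

```python
# Two-point shock, two actions; payoff f(a, zeta) = a * zeta.
shocks = [(-1.0, 0.5), (1.0, 0.5)]   # (shock value, probability)
actions = [-1.0, 1.0]
f = lambda a, z: a * z               # best action matches the shock's sign

# Expectation "on the left": shock observed, action adapts to it.
v_obs = sum(p * max(f(a, z) for a in actions) for z, p in shocks)

# Expectation "inside the choice": one action must serve both shocks.
v_unobs = max(sum(p * f(a, z) for z, p in shocks) for a in actions)
```

Here `v_obs = 1.0` while `v_unobs = 0.0`: with the shock unobserved, no single action can exploit it.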
Perches don’t proliferate¶
Between decision and continuation you might compute:

- a maximization (choice)
- an expectation (unobserved uncertainty)
- other operators (inversion, projection, …)
That does not create new perches.
Perches track information, not “how many operators we applied.” (Scope pointer: see Factorizations in Scope.)
What DDSL will ask you to specify (high-level)¶
- The state objects at each perch \(x_{\text{arvl}}, x_{\text{dcsn}}, x_{\text{cntn}}\)
- Shocks and when they are revealed (observed vs unobserved)
- Transitions between perches
- Rewards and discounting
Numerics (grids, interpolation, quadrature) live elsewhere.
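One way to picture the four ingredients above is as a container, sketched here in Python. This is NOT DDSL syntax; every field and function name is a hypothetical illustration.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class StageSpec:
    """Hypothetical container for a stage specification (not DDSL syntax)."""
    perch_states: dict   # state object at each perch
    shocks: dict         # shock name -> "observed" | "unobserved"
    transitions: dict    # (perch, perch) -> transition map
    reward: Callable     # flow reward
    discount: float      # per-stage discount factor

spec = StageSpec(
    perch_states={"arvl": "w", "dcsn": "(w, zeta)", "cntn": "w_next"},
    shocks={"zeta": "observed", "eta": "unobserved"},
    transitions={("dcsn", "cntn"): lambda w, z, c, eta: eta * (w + z - c)},
    reward=lambda c: c ** 0.5,
    discount=0.95,
)
```

Grids and quadrature deliberately do not appear: they belong to the numerics layer, not the specification.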
A stage is already a small graph (perches = nodes, movers = edges)¶
The stage’s content is an operator factorization: it makes the Bellman structure explicit as composable pieces.
Conjugation (duality): Backward and forward operators are conjugates: \(\langle \mathcal{F} \mu, V \rangle = \langle \mu, \mathcal{B} V \rangle\)
- \(\mathcal{B}_{\text{arvl}}\) (expectation) ↔ \(\mathcal{F}_{\text{arvl}}\) (push-forward)
- \(\mathcal{B}_{\text{dcsn}}\) (Bellman max) ↔ \(\mathcal{F}_{\text{dcsn}}\) (policy push-forward)
Syntax specifies the backward (problem) side; the forward (simulation) side is derived via conjugation. (Scope pointer: see Problem Chunks.)
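The conjugation identity can be verified directly on a finite state space, where (under the finite-state reading, an assumption of this sketch) the backward expectation operator is a transition matrix acting on value functions and the forward push-forward is its transpose acting on measures:

```python
import numpy as np

# Finite-state check of <F mu, V> = <mu, B V> for the expectation mover.
rng = np.random.default_rng(0)
P = rng.random((4, 4))
P /= P.sum(axis=1, keepdims=True)   # row-stochastic transition matrix

V = rng.random(4)                    # a value function (backward object)
mu = rng.random(4)
mu /= mu.sum()                       # a probability measure (forward object)

lhs = (P.T @ mu) @ V                 # <F mu, V> : push measure forward
rhs = mu @ (P @ V)                   # <mu, B V> : pull value backward
```

The two sides agree by associativity of matrix multiplication, which is exactly why specifying only the backward side suffices.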
A model is a graph (not a sequence) of stages¶
In the simplest case the stages form a chain, but branching and reuse are natural.
Edges are connectors: formal maps that wire one stage’s continuation objects into another stage’s arrival objects (renaming, projections, “twisters”, etc.). (Scope pointer: see Problem Chunks.)
Graph view ↔ category view (same idea, more structure)¶
- graph: stages as nodes, connectors as arrows
- category: stages as objects; connectors as morphisms; wiring is composition
(identities correspond to “no-op” connectors)
This is the right abstraction when models are not linear time-indexed scripts.
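A minimal sketch of connectors as composable maps (illustrative only: `rename`, `project`, and `compose` are hypothetical names standing in for categorical composition with identity):

```python
# Stages as nodes, connectors as composable maps between them.
def compose(*connectors):
    """Compose connectors left-to-right; wiring stages is composition."""
    def composed(x):
        for c in connectors:
            x = c(x)
        return x
    return composed

identity = lambda state: state                  # the "no-op" connector
rename = lambda state: {"w": state["wealth"]}   # wire cntn names to arvl names
project = lambda state: state["w"]              # drop unused components

wire = compose(identity, rename, project)       # one edge in the stage graph
out = wire({"wealth": 3.0})
```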
The expressive job of DDSL (and what it is not)¶
- DDSL is for: representing Bellman operators (stages) and connecting them formally (connectors / composition)
- DDSL is not: “start with a sequence of equations over time” as the primary organizing principle
Sequence is a special case of a stage graph.
Next: DDSL foundations (deck 2/2)¶
Open: AI/working/12012026/ddsl_foundations.md
Topics:

- SYM vs CORE
- Υ / ρ meaning maps
- methodization + calibration + settings
- worked examples (top-down)