
5. Bellman Theory Integration

This section establishes how DDSL models connect to the mathematical theory of dynamic programming through movers as the central functional operators, the ADC (Arrival-Decision-Continuation) framework, and the composition of registry operators.

5.1 Movers as the Central Concept

The Primacy of Movers

In the DDSL, movers are the primary functional operators that implement the Bellman decomposition:

Definition (Mover): A mover is a syntactic object that represents a functional operator transforming functions or distributions between perches. Each mover:

  • Has a mathematical meaning under Υ (an operator on function/distribution spaces)
  • Composes built-in registry operators (E, argmax, ~, APPROX)
  • Has an approximate numerical counterpart once methodized and calibrated

Movers are what the methodization functor acts upon to produce numerical algorithms.

Mover Categories

Backward movers implement value function updates (Bellman-type operations):

  • Transform continuation values into current-period values
  • Implement expectation over shocks and optimization over actions
  • Flow: continuation → decision → arrival

Forward movers implement distribution evolution (simulation/push-forward):

  • Transform distributions through state transitions and policies
  • Implement sampling, policy application, and state evolution
  • Flow: arrival → decision → continuation

The Factored Bellman Operator

The Bellman operator is expressed as composition of movers:

\[ \mathbb{T} = \mathbb{I} \circ \mathbb{B} \]

where:

  • \(\mathbb{B}\): Decision mover (optimization + continuation linkage)
  • \(\mathbb{I}\): Arrival mover (expectation over shocks)

Similarly, the conjugate (forward) operator:

\[ \mathbb{T}^* = \mathrm{F}_{\succ} \circ \mathrm{F}_{\prec} \]
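The factorization can be sketched in plain Python for a toy scalar problem. The utility \(u(a) = \log(1+a)\), discount 0.95, action grid, and two-point shock distribution below are illustrative assumptions, not part of the DDSL:

```python
import math

# Illustrative calibration (not from the DDSL spec)
beta = 0.95
shocks = [(-0.1, 0.5), (0.1, 0.5)]   # (value, probability) pairs for w
actions = [0.0, 0.25, 0.5]

def g_av(xa, w):                     # arrival -> decision kernel
    return xa + w

def g_ve(xv, a):                     # decision -> continuation kernel
    return xv - a

def B_dcsn(Ve):
    """Decision mover B: continuation value -> decision value (max over actions)."""
    def Vv(xv):
        return max(math.log(1.0 + a) + beta * Ve(g_ve(xv, a)) for a in actions)
    return Vv

def B_arvl(Vv):
    """Arrival mover I: decision value -> arrival value (expectation over w)."""
    def Va(xa):
        return sum(p * Vv(g_av(xa, w)) for w, p in shocks)
    return Va

def T(Ve):
    """Factored Bellman operator T = I o B."""
    return B_arvl(B_dcsn(Ve))

Va = T(lambda x: x)                  # one application to the guess Ve(x) = x
```

With the linear guess Ve(x) = x, the action a = 0 is optimal everywhere, so one application gives Va(x) = 0.95x.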

5.2 The ADC Factorization

Perches as information sets (the correct framing)

It is tempting to think of perches as “endpoints of operators” (expectation, maximization, etc.). That framing is misleading.

Correct framing: perches represent information sets, i.e. \(\sigma\)-algebras forming a filtration, describing what is known at each point within a stage. Perches exist to answer one question:

What can the actor condition their policy on?

Formally, within a stage we have a filtration

\[ \mathcal{F}_{\prec} \subseteq \mathcal{F}_{\sim} \subseteq \mathcal{F}_{\succ}, \qquad \mathcal{F}_{m} := \sigma(x_{m}), \]

where \(x_m\) is the (sufficient) information state at perch \(m\).

This framing is compatible with (and motivated by) POMDPs: in the fully general case, policies depend on observation history (or beliefs). We restrict attention to the common economic case where the decision-time information state is a sufficient statistic for continuation (a Markov restriction), so we can work with finite-dimensional \(x_m\) rather than belief distributions.

Perch tags are measurability claims

In DDSL-SYM/dolo-plus stage notation, a perch tag is a measurability/adaptedness claim:

  • \(z[<]\) means \(z\) is \(\mathcal{F}_{\prec}\)-measurable
  • \(z\) (unmarked) means \(z\) is \(\mathcal{F}_{\sim}\)-measurable
  • \(z[>]\) means \(z\) is \(\mathcal{F}_{\succ}\)-measurable

So every equation must be written so that the mapping exists: the RHS can only use information available at that perch.

This also clarifies where expectation/maximization “live”: they are operations inside movers, and different operator orderings (e.g. moving an expectation inside a maximization) do not change the meaning of perches themselves.

Three-Perch Structure

The ADC framework decomposes each period into three perches:

  1. Arrival (\(\prec\)): State at the beginning of the period, before shock realization
  2. Decision (\(\sim\)): State after shock realization, where optimization occurs
  3. Continuation (\(\succ\)): State after decision, linking to the next period

Each perch has its own state space:

  • \(X_{\prec}\): arrival state space
  • \(X_{\sim}\): decision state space
  • \(X_{\succ}\): continuation state space

Perch-to-Perch Movers

The ADC structure defines canonical mover directions:

Backward (value iteration):
    Ve (continuation) ──B_dcsn──> Vv (decision) ──B_arvl──> Va (arrival)

Forward (simulation):
    D_arvl (arrival) ──F_arvl──> D_dcsn (decision) ──F_dcsn──> D_cntn (continuation)

In YAML:

movers:
    # Backward movers
    "cntn_to_dcsn": B_dcsn: [Xe -> R] -> [Xv -> R, Xv -> A]
    "dcsn_to_arvl": B_arvl: [Xv -> R] -> [Xa -> R]

    # Forward movers
    "arvl_to_dcsn": F_arvl: Dist(Xa) -> Dist(Xv)
    "dcsn_to_cntn": F_dcsn: Dist(Xv) -> Dist(Xe)

5.3 Built-in Operators Inside Movers

Each mover composes registry operators to implement its transformation. The operator registry (§3.5) provides the building blocks.

Note on notation: the @methods: {...} lines shown inside mover bodies below are a compact way to display the result of methodization (a “methodized-stage view”). In the canonical three-file architecture, the authored stage.yml contains no @methods; method choices live in methodization.yml.

The Decision Mover (\(\mathbb{B}\))

The decision mover implements the Bellman maximization:

B_dcsn: Ve -> [Vv, astar] @via |
    astar(xv) = argmax_{a in Gamma(xv)} Q_kernel(xv, a)
    Vv(xv) = Q_kernel(xv, astar(xv))
    @methods: {argmax: optim_a, Vv: interp_Vv}

Registry operators used:

  • argmax_{a ∈ Γ(x)}: optimization operator (finds the optimal action)
  • APPROX: approximation operator (attached by methodization to produce Vv as an interpolant)

Mathematical meaning under Υ: $$ (\mathbb{B}\, V_{\succ})(x_{\sim}) = \max_{a \in \Gamma(x_{\sim})} Q(x_{\sim}, a) $$ $$ a^*(x_{\sim}) = \arg\max_{a \in \Gamma(x_{\sim})} Q(x_{\sim}, a) $$

where \(Q(x_{\sim}, a) = u(a) + \beta V_{\succ}(\mathrm{g}_{\sim\succ}(x_{\sim}, a))\).
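A minimal grid-search sketch of this mover, assuming a scalar state, illustrative utility \(u(a) = \log(1+a)\), kernel \(x_{\succ} = x_{\sim} - a\), and feasible set \(\Gamma(x) = \{a \text{ in the grid} : a \le x\}\):

```python
import math

beta = 0.95                          # assumed discount factor

def g_ve(xv, a):                     # decision -> continuation kernel (assumed)
    return xv - a

def u(a):                            # illustrative utility
    return math.log(1.0 + a)

def decision_mover(Ve, action_grid):
    """Return (Vv, astar) from a continuation value Ve, i.e. (B V_succ)."""
    def Q(xv, a):
        return u(a) + beta * Ve(g_ve(xv, a))
    def astar(xv):
        feasible = [a for a in action_grid if a <= xv]   # Gamma(xv)
        return max(feasible, key=lambda a: Q(xv, a))
    def Vv(xv):
        return Q(xv, astar(xv))
    return Vv, astar

# With a zero continuation value, the mover just maximizes current utility:
Vv, astar = decision_mover(lambda xe: 0.0, action_grid=[0.0, 0.5, 1.0])
```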

The Arrival Mover (\(\mathbb{I}\))

The arrival mover implements the conditional expectation:

B_arvl: Vv -> Va @via |
    xv = g_av(xa, w)
    Va(xa) = E_w[Vv(xv)]
    @methods: {E_w: expect_w, Va: interp_Va}

Registry operators used:

  • E_w: expectation operator (integrates over the shock distribution)
  • APPROX: approximation operator (produces Va as an interpolant)

Mathematical meaning under Υ: $$ (\mathbb{I}\, V_{\sim})(x_{\prec}) = \mathbb{E}_w\left[V_{\sim}(\mathrm{g}_{\prec\sim}(x_{\prec}, w))\right] $$

This is a conditional expectation: we condition on the arrival state \(x_{\prec}\) and integrate over the shock \(w\).
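As a sketch, the expectation can be realized by any discrete quadrature rule over the shock; the node/weight pairs and kernels below are placeholders:

```python
def arrival_mover(Vv, g_av, nodes_weights):
    """Return Va(xa) = E_w[Vv(g_av(xa, w))] under a quadrature rule."""
    def Va(xa):
        return sum(p * Vv(g_av(xa, w)) for w, p in nodes_weights)
    return Va

# Two-point symmetric rule with E[w] = 0 and Var(w) = 1, applied to Vv(x) = x^2:
rule = [(-1.0, 0.5), (1.0, 0.5)]
Va = arrival_mover(lambda xv: xv * xv, lambda xa, w: xa + w, rule)
```

Since E[(xa + w)^2] = xa^2 + Var(w), this gives Va(0) = 1 and Va(2) = 5; the rule is exact here because it matches the first two moments of the shock.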

Forward Mover: Arrival to Decision (\(\mathrm{F}_{\prec}\))

F_arvl: D_arvl -> D_dcsn @via |
    xa_tilde ~ D_arvl
    w ~ D_w
    xv_tilde = g_av(xa_tilde, w)
    @methods: {w: simulate_w}

Registry operators used:

  • ~ (draw): sampling operator (draws from distributions)
  • PushForward: push-forward induced by the transition kernel (often implicit in SYM; realized via sampling/quadrature under ρ)

Kernel vs. method: the transition kernel g_av is a pure function object. The methodization-relevant pieces for forward evolution are the draw / push-forward operators (e.g., simulation scheme, RNG settings), not the kernel definition itself.

Mathematical meaning under Υ: $$ \mathrm{F}_{\prec}\, \mu_{\prec} = (\mathrm{g}_{\prec\sim})_* \left(\mu_{\prec} \otimes \mu_w\right) $$

The pushforward of the product measure through the transition kernel.
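A Monte Carlo sketch of this push-forward, representing distributions as sample lists; the two-point shock law and additive kernel are illustrative:

```python
import random

def forward_arrival(samples_xa, draw_w, g_av, rng):
    """Push a sample-based arrival distribution forward to the decision perch."""
    return [g_av(xa, draw_w(rng)) for xa in samples_xa]

rng = random.Random(0)
mu_a = [1.0] * 10_000                          # degenerate distribution at xa = 1
mu_v = forward_arrival(mu_a,
                       draw_w=lambda r: r.choice([-0.5, 0.5]),
                       g_av=lambda xa, w: xa + w,
                       rng=rng)
```

Each output sample is a draw from the push-forward \((\mathrm{g}_{\prec\sim})_*(\mu_{\prec} \otimes \mu_w)\), here a 50/50 mixture at 0.5 and 1.5.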

Forward Mover: Decision to Continuation (\(\mathrm{F}_{\succ}\))

F_dcsn: D_dcsn -> D_cntn @via |
    xv_tilde ~ D_dcsn
    a_tilde = astar(xv_tilde)
    xe_tilde = g_ve(xv_tilde, a_tilde)

Registry operators used:

  • ~ (draw): sampling from the decision distribution
  • Policy evaluation: apply the converged policy function
  • Push-forward through the transition kernel

Mathematical meaning under Υ: $$ \mathrm{F}_{\succ}\, \mu_{\sim} = \left(\mathrm{g}_{\sim\succ}(\cdot, a^*(\cdot))\right)_* \, \mu_{\sim} $$
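The same sample-based sketch applies here: evaluate the policy pointwise, then push through the transition kernel. The policy a*(x) = x/2 and kernel g_ve(x, a) = x − a are illustrative:

```python
def forward_decision(samples_xv, astar, g_ve):
    """Push decision-perch samples through the policy and the transition kernel."""
    return [g_ve(xv, astar(xv)) for xv in samples_xv]

# Illustrative policy a*(x) = x/2 and kernel g_ve(x, a) = x - a:
mu_e = forward_decision([1.0, 2.0, 3.0],
                        astar=lambda xv: 0.5 * xv,
                        g_ve=lambda xv, a: xv - a)
```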

5.4 Transition Kernels

The ADC framework uses specific transition kernels between perches:

equations:
    # Arrival to decision (with shock)
    g_av: [xa, w] @mapsto xv @via |
        xv = xa * r + w

    # Decision to continuation (with action)
    g_ve: [xv, a] @mapsto xe @via |
        xe = xv - a

These kernels define the state evolution:

  • \(x_{\sim} = \mathrm{g}_{\prec\sim}(x_{\prec}, w)\): Shock realization moves the state from arrival to decision
  • \(x_{\succ} = \mathrm{g}_{\sim\succ}(x_{\sim}, a)\): Action moves the state from decision to continuation
  • \(x_{\prec,+1} = g_{ea}(x_{\succ})\): Time evolution to the next period (often the identity)

Method note: transition kernels are Υ-level primitives. They only become method-relevant when you evaluate movers under ρ and the mover execution invokes methodized operators (expectation, optimization/rootfinding, approximation/interpolation, draws/push-forwards, etc.).
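The YAML kernels above translate directly into pure functions; the calibration value r = 1.02 is an assumed placeholder, not part of the spec:

```python
r = 1.02   # assumed calibration value, for illustration only

def g_av(xa, w):
    """Arrival -> decision: return on assets plus the income shock."""
    return xa * r + w

def g_ve(xv, a):
    """Decision -> continuation: resources net of the chosen action."""
    return xv - a
```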

5.5 Value Functions and Policy Functions

Value Function Decomposition

The ADC framework decomposes the value function into perch-specific components:

  • \(V_{\prec}: X_{\prec} \to \mathbb{R}\) — value at arrival
  • \(V_{\sim}: X_{\sim} \to \mathbb{R}\) — value at decision (after observing shock)
  • \(V_{\succ}: X_{\succ} \to \mathbb{R}\) — continuation value

These satisfy the system:

$$
\begin{aligned}
V_{\succ}(x_{\succ}) &= V_{\prec,+1}(g_{ea}(x_{\succ})) \\
V_{\sim}(x_{\sim}) &= \max_{a \in \Gamma(x_{\sim})} \left\{ u(a) + \beta V_{\succ}(\mathrm{g}_{\sim\succ}(x_{\sim}, a)) \right\} \\
V_{\prec}(x_{\prec}) &= \mathbb{E}_w\left[V_{\sim}(\mathrm{g}_{\prec\sim}(x_{\prec}, w))\right]
\end{aligned}
$$

Policy Functions

The optimization at the decision perch yields a policy function:

\[ a^*(x_{\sim}) = \arg\max_{a \in \Gamma(x_{\sim})} \left\{ u(a) + \beta V_{\succ}(\mathrm{g}_{\sim\succ}(x_{\sim}, a)) \right\} \]

This policy function is:

  • Computed by the decision mover (\(\mathbb{B}\))
  • Stored/approximated according to its method schema
  • Used by forward movers for simulation

5.6 Forward-Backward Duality

Conjugate Operators

Each backward mover has a conjugate forward mover:

| Backward Mover | Conjugate Forward Mover | Duality |
| --- | --- | --- |
| \(\mathbb{I}\): \([X_{\sim} \to \mathbb{R}] \to [X_{\prec} \to \mathbb{R}]\) | \(\mathrm{F}_{\prec}\): \(\mathcal{D}(X_{\prec}) \to \mathcal{D}(X_{\sim})\) | Expectation ↔ Pushforward |
| \(\mathbb{B}\): \([X_{\succ} \to \mathbb{R}] \to [X_{\sim} \to \mathbb{R}]\) | \(\mathrm{F}_{\succ}\): \(\mathcal{D}(X_{\sim}) \to \mathcal{D}(X_{\succ})\) | Bellman ↔ Policy pushforward |

Duality Relation

The forward and backward operators satisfy:

\[ \langle \mathbb{T}^* \mu, V \rangle = \langle \mu, \mathbb{T} V \rangle \]

where \(\langle \cdot, \cdot \rangle\) denotes integration: \(\langle \mu, f \rangle = \int f \, d\mu\).
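For a fixed policy the two operators are linear, and the relation can be checked directly on a finite state space, where the backward operator is a Markov matrix acting on value vectors and the forward operator is its transpose acting on distributions. The numbers below are a toy illustration:

```python
# Transition matrix under a fixed policy: P[i][j] = Pr(next = j | current = i)
P = [[0.9, 0.1],
     [0.4, 0.6]]
mu = [0.3, 0.7]                      # distribution over states
V = [1.0, 2.5]                      # value function

TV = [sum(P[i][j] * V[j] for j in range(2)) for i in range(2)]        # backward
Tstar_mu = [sum(mu[i] * P[i][j] for i in range(2)) for j in range(2)] # forward

lhs = sum(Tstar_mu[j] * V[j] for j in range(2))   # <T* mu, V>
rhs = sum(mu[i] * TV[i] for i in range(2))        # <mu, T V>
```

Both sides compute the same double sum \(\sum_{i,j} \mu_i P_{ij} V_j\), grouped differently.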

5.7 Computational Flow

Backward Pass (Value Function Iteration)

Ve (given/guess) 
    ↓ B_dcsn (argmax, APPROX)
[Vv, astar] 
    ↓ B_arvl (E_w, APPROX)
Va 
    ↓ link to next period
Ve (updated)

Algorithm:

  1. Start with an initial guess \(V_{\succ}^0\)
  2. Apply the decision mover: \((V_{\sim}^n, a^{*,n}) = \mathbb{B}\, V_{\succ}^n\)
  3. Apply the arrival mover: \(V_{\prec}^n = \mathbb{I}\, V_{\sim}^n\)
  4. Update the continuation value: \(V_{\succ}^{n+1} = V_{\prec}^n\) (or the appropriate linkage)
  5. Check convergence: \(\|V_{\succ}^{n+1} - V_{\succ}^n\| < \epsilon\)
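The loop can be sketched end to end on a coarse grid. The three-point state grid, log utility, and deterministic transition x' = x − a + 1 are illustrative assumptions (the shock step is omitted for brevity):

```python
import math

beta = 0.9
states = [1.0, 2.0, 3.0]             # resources grid (illustrative)
tol = 1e-10

def bellman_update(Ve):
    """One pass of the backward movers on the grid (no shock, for brevity)."""
    out = {}
    for x in states:
        out[x] = max(math.log(a) + beta * Ve[x - a + 1.0]
                     for a in states if a <= x)
    return out

Ve = {x: 0.0 for x in states}        # step 1: initial guess
for n in range(10_000):
    Vnew = bellman_update(Ve)        # steps 2-3: apply the movers
    err = max(abs(Vnew[x] - Ve[x]) for x in states)
    Ve = Vnew                        # step 4: update the continuation value
    if err < tol:                    # step 5: convergence check
        break
```

For this toy problem the fixed point is V(1) = 0, V(2) = log 2, V(3) = 1.9 log 2.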

Forward Pass (Simulation)

D_arvl (given/initial)
    ↓ F_arvl (~, pushforward)
D_dcsn
    ↓ F_dcsn (policy, pushforward)
D_cntn
    ↓ link to next period
D_arvl (next period)

Algorithm:

  1. Start with an initial distribution \(\mu_{\prec}^0\)
  2. Apply the arrival forward mover: \(\mu_{\sim}^t = \mathrm{F}_{\prec}\, \mu_{\prec}^t\)
  3. Apply the decision forward mover: \(\mu_{\succ}^t = \mathrm{F}_{\succ}\, \mu_{\sim}^t\)
  4. Evolve to the next period: \(\mu_{\prec}^{t+1} = \mathrm{F}_{\text{link}}\, \mu_{\succ}^t\)
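A sample-based sketch of the forward pass, with an assumed two-point shock, linear policy a*(x) = x/2, and identity period linkage:

```python
import random

rng = random.Random(42)

def F_arvl(mu_a):                    # arrival -> decision: add a shock draw
    return [xa + rng.choice([-0.1, 0.1]) for xa in mu_a]

def F_dcsn(mu_v):                    # decision -> continuation: apply the policy
    return [xv - 0.5 * xv for xv in mu_v]

def F_link(mu_e):                    # continuation -> next period's arrival
    return list(mu_e)

mu = [1.0] * 1000                    # initial arrival distribution
for t in range(50):
    mu = F_link(F_dcsn(F_arvl(mu)))
```

The induced chain x_{t+1} = (x_t + w_t)/2 contracts toward a stationary distribution centered at zero, so after 50 periods the sample mean is near zero.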

5.8 Convergence in the Limit

Mathematical Contraction

Under standard conditions (boundedness, discounting, continuity), the factorized Bellman operator is a contraction:

\[ \|\mathbb{T}V - \mathbb{T}V'\| \leq \beta \|V - V'\| \]

This guarantees:

  1. Existence: a fixed point \(V^*\) exists
  2. Uniqueness: the value function is uniquely determined
  3. Convergence: value iteration converges geometrically
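The contraction property can be checked empirically for any discrete Bellman operator; the two-action toy operator below (stay with reward 1, or reset to state 0 with reward 0.5) is an illustrative assumption:

```python
import random

beta = 0.8
rng = random.Random(1)
n = 5                                 # number of states

def T(V):
    """Toy Bellman operator: stay (reward 1) or reset to state 0 (reward 0.5)."""
    return [max(1.0 + beta * V[s], 0.5 + beta * V[0]) for s in range(n)]

V1 = [rng.uniform(-5.0, 5.0) for _ in range(n)]
V2 = [rng.uniform(-5.0, 5.0) for _ in range(n)]
d_before = max(abs(a - b) for a, b in zip(V1, V2))
d_after = max(abs(a - b) for a, b in zip(T(V1), T(V2)))
```

Because the max is nonexpansive, the sup-norm distance shrinks by at least the factor beta regardless of the draws.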

Approximate Operators and Limit Agreement

The methodized and calibrated mover under ρ is an approximate operator:

\[ \mathbb{T}_{\text{num}} = \rho(\mathcal{M}(\mathbb{T}), \mathbb{C}) \]

This approximate operator:

  • Acts on discrete primitive objects (grid values, coefficient arrays)
  • Is not equal to the mathematical operator \(\Upsilon(\mathbb{T})\)
  • Converges to the true operator as the discretization improves

Limit Agreement Theorem (informal): Under appropriate assumptions on the numerical schemes:

\[ \lim_{\substack{n \to \infty \\ \text{nodes} \to \infty \\ \text{tol} \to 0}} \mathbb{T}_{\text{num}}^{(n)} = \Upsilon(\mathbb{T}) \]

where the convergence is in an appropriate operator norm.

Error Decomposition

The total error in the numerical solution decomposes as:

\[ \|V^*_{\text{num}} - V^*\| \leq \underbrace{\epsilon_{\text{interp}}}_{\text{interpolation}} + \underbrace{\epsilon_{\text{quad}}}_{\text{quadrature}} + \underbrace{\epsilon_{\text{optim}}}_{\text{optimization}} + \underbrace{\epsilon_{\text{iter}}}_{\text{iteration}} \]

Each term corresponds to a registry operator and its method:

  • \(\epsilon_{\text{interp}}\): from APPROX with the interpolation scheme
  • \(\epsilon_{\text{quad}}\): from E_w with the quadrature scheme
  • \(\epsilon_{\text{optim}}\): from argmax with the optimization scheme
  • \(\epsilon_{\text{iter}}\): from stopping after finitely many iterations

5.9 Buffer-Stock Example: Complete Mover Structure

Stage Definition

stage:
  name: consind
  model_type: adc_stage
  inputs: [Ve, D_arvl]
  outputs: [Va, astar, D_dcsn, D_cntn]

Backward Movers

Decision mover (cntn_to_dcsn):

B_dcsn: Ve -> [Vv, astar] @via |
    astar(xv) = argmax_{a in Gamma(xv)} Q_kernel(xv, a)
    Vv(xv) = Q_kernel(xv, astar(xv))
    @methods: {argmax: optim_a, Vv: interp_Vv}

Mathematical meaning:

  • Uses argmax from the registry to find \(a^*(x_{\sim})\)
  • Uses APPROX to store \(V_{\sim}\) as an interpolant
  • \(Q(x_{\sim}, a) = u(a) + \beta V_{\succ}(\mathrm{g}_{\sim\succ}(x_{\sim}, a))\)

Arrival mover (dcsn_to_arvl):

B_arvl: Vv -> Va @via |
    xv = g_av(xa, w)
    Va(xa) = E_w[Vv(xv)]
    @methods: {E_w: expect_w, Va: interp_Va}

Mathematical meaning:

  • Uses E_w from the registry for the conditional expectation
  • Integrates over the shock \(w\) conditioned on the arrival state \(x_{\prec}\)
  • Uses APPROX to store \(V_{\prec}\) as an interpolant

Forward Movers

Arrival forward (arvl_to_dcsn):

F_arvl: D_arvl -> D_dcsn @via |
    xa_tilde ~ D_arvl
    w ~ D_w
    xv_tilde = g_av(xa_tilde, w)
    @methods: {w: simulate_w}

Decision forward (dcsn_to_cntn):

F_dcsn: D_dcsn -> D_cntn @via |
    xv_tilde ~ D_dcsn
    a_tilde = astar(xv_tilde)
    xe_tilde = g_ve(xv_tilde, a_tilde)

Factored Bellman Operator

The complete Bellman operator is:

\[ \mathbb{T} = \mathbb{I} \circ \mathbb{B} \]

which under Υ corresponds to:

\[ (\mathbb{T}V_{\succ})(x_{\prec}) = \mathbb{E}_w\left[\max_{a \in \Gamma(\mathrm{g}_{\prec\sim}(x_{\prec},w))} \left\{ u(a) + \beta V_{\succ}(\mathrm{g}_{\sim\succ}(\mathrm{g}_{\prec\sim}(x_{\prec},w), a)) \right\}\right] \]
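A single application of this composed operator can be sketched with a zero continuation guess. The action grid, log utility, and three-point shock rule are illustrative, and β and r are assumed calibration values:

```python
import math

beta, r = 0.95, 1.02                  # assumed calibration
shock_rule = [(-0.2, 0.25), (0.0, 0.5), (0.2, 0.25)]   # (w, prob) pairs
a_grid = [0.2, 0.4, 0.6, 0.8]

def g_av(xa, w):                      # arrival -> decision
    return xa * r + w

def g_ve(xv, a):                      # decision -> continuation
    return xv - a

def T(Ve):
    """T = I o B: expectation over w of the inner maximization."""
    def Vv(xv):
        return max(math.log(a) + beta * Ve(g_ve(xv, a))
                   for a in a_grid if a < xv)
    def Va(xa):
        return sum(p * Vv(g_av(xa, w)) for w, p in shock_rule)
    return Va

Va = T(lambda xe: 0.0)                # one application to the zero guess
```

With a zero continuation value the inner max picks the largest feasible action, so for xa = 1 every shock branch yields log(0.8).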

5.10 Summary

The Bellman theory integration in DDSL provides:

  1. Movers as central objects: Functional operators that implement the Bellman decomposition
  2. Registry operator composition: Each mover uses E, argmax, ~, APPROX from the operator registry
  3. ADC factorization: Systematic decomposition into arrival, decision, continuation
  4. Forward-backward duality: Conjugate operators for value iteration and simulation
  5. Convergence guarantees: Mathematical operators as limits of approximate numerical operators
  6. Explicit method attachment: Registry operators get schemes through methodization

Key insight:

A syntactic mover maps to a mathematical operator under Υ. A methodized mover maps to an approximate operator under ρ. The two agree only in the limit as discretization parameters go to their ideal values.

This ensures:

  • Theoretical grounding: based on established dynamic programming theory
  • Computational clarity: explicit decomposition into implementable pieces
  • Numerical control: error bounds through method selection
  • Modularity: movers can be composed and reused