8. SYM / T core Two-Layer Syntax Architecture¶

Version: 0.1.0
Date: December 22, 2025
Status: Design Note

8.1 Motivation¶

The BE-DDSL foundations (§1–§7) define a rigorous operator-theoretic framework based on Backus' Functional Programming principles. This framework—with its explicit typing, operator registry, methodization functor, and representation maps—provides the formal guarantees we need for correctness and composability.

However, this operator-theoretic syntax may be unfamiliar or intimidating to applied economists, who are accustomed to more "intuitive" model specification styles (e.g., current Dolo YAML, Dynare, HARK).

Moreover, we have a backwards compatibility constraint: existing Dolo models must continue to work unchanged.

This section introduces a two-layer syntax architecture that resolves this tension:

SYM ("Surface/Symbolic"): An economist-friendly surface syntax
T core ("Core/Operator-theoretic"): The rigorous internal representation

Terminology: we now call the internal layer T core (older docs: “CORE” / “DDSL-CORE”).

8.2 The Two Layers¶

8.2.1 SYM (Surface Layer)¶

SYM is the user-facing syntax layer. It is designed to:

Feel natural to applied economists
Uses familiar notation and conventions
Minimizes explicit type annotations where they can be inferred
Allows "implicit" operator usage (e.g., E[...] without declaring operator instances)
Maintain backwards compatibility with Dolo
Existing Dolo models parse without modification
Current symbols:, equations:, calibration: blocks work as before
DDSL extensions are purely additive
Preserve the spirit of current syntax
Declarative, not imperative
Minimal boilerplate
Economist-readable at a glance

Example (SYM style):

symbols:
    states: [m]
    controls: [c]
    parameters: [β, γ, r]

equations:
    transition: |
        m_next = (m - c) * r + y

    arbitrage: |
        β * (c_next/c)^(-γ) * r = 1

This is essentially current Dolo syntax—no operator registry, no explicit typing, no @mapsto.

8.2.2 T core (Internal Layer)¶

T core is the rigorous internal representation. It is:

Fully operator-theoretic
All operators explicitly declared in operator registry
All symbols explicitly typed with @in, @def, @mapsto
Movers, perches, and ADC structure fully specified
Supports formal semantics
Every construct has a well-defined Υ (mathematical meaning)
Methodization functor operates on explicit operator instances
Type system ensures composition validity
Based on Backus FFP principles
Named function objects (nullary for constants/spaces, higher-order for operators)
No hidden state; single-assignment semantics
Algebraic laws for transformation

Analogy: Lean 4 elaboration ≈ φ (SYM → T core)

The relationship between SYM and T core parallels Lean 4's elaboration pipeline. In Lean, elaboration is the process that transforms surface-level Syntax (user-facing, sugar-rich) into fully-typed Expr kernel terms (small, trusted). The kernel never sees surface syntax — it only type-checks elaborated terms.

In Bellman calculus the same pattern holds: the φ map (SYM → T core) is the elaboration step. It desugars perch-tag aliases, resolves bare symbols, infers operator instances, and normalizes implicit function typing — producing a fully-typed Bellman-calculus stage IR that downstream machinery (Υ, methodization, Ρ) can trust without re-parsing the surface YAML.

Lean 4	Bellman calculus
`Syntax` (surface)	SYM (dolo-plus YAML)
Elaboration (tactics, macros, type inference)	φ (desugaring, perch inference, operator resolution)
`Expr` (kernel term)	T core (normalized, typed stage IR)
Kernel (CIC type-checker)	T core validator (perch measurability, type consistency)

See spec 0.2a (compiler pipeline architecture) for the full Lean-inspired pipeline design, and the Lean 4 source context for the reference material. #elaboration #lean4

Example (T core style):

href="#__codelineno-1-1">symbols: spaces: "arrival_state_space": Xa @def R+ "decision_state_space": Xv @def R+ variables: states: "arrival_state": xa @in Xa "decision_state": xv @in Xv controls: "consumption": c @in R+ parameters: "discount_factor": β @in (0,1) "risk_aversion": γ @in R+ "return_factor": r @in R+ >operators: - name: E_y class: expectation - name: argmax_c class: optimization >perches: arrival: backward: B_arvl: Vv -> Va @via | Va(xa) = E_y[Vv(g_av(xa, y))] decision: backward: B_dcsn: Ve -> [Vv, c_star] @via | c_star(xv) = argmax_{c in Gamma(xv)} Q(xv, c) Vv(xv) = Q(xv, c_star(xv))

8.2.3 Multiple SYM tiers (and the Bellman-calculus contract)¶

In practice, SYM is not a single surface language. It is better thought of as a family of increasingly high-level surface representations, all of which lower to the same internal CORE meaning.

dolo-plus SYM (current): the “schema examples” used throughout the dolo-plus syntax/semantics notes (e.g. Dolo+ syntax & semantics overview) are one concrete SYM tier. They already rely on implicit function definitions (e.g., transitions and perch-local movers are specified via equation blocks rather than explicit @mapsto functions everywhere).
HABLO-style SYM (hypothetical higher tier): Alan’s HABLO-style model syntax appears to allow more implicit structure (e.g., implicit definition of transition functions and value functions from higher-level declarations). Each such step upward requires additional assumptions, additional inference rules, and usually additional parsing conventions.

One concrete example of “dolo-plus SYM sugar” is implicit function typing / implicit evaluation. For example, a stage may contain an equation like:

equations:
  # (illustrative) continuation-perch value evaluation
  value: |
    V[>] = V_cntn_fn[x[>]]

The surface form does not explicitly declare the type of V_cntn_fn (e.g., $V_{\succ}: X_{\succ} \to \mathbb{R}$) nor the evaluation operator; instead, SYM parsing rules infer these from context (perch tags like [>], symbol declarations, and stage structure) and lower this to an explicit, typed CORE form.

This raises a genuine boundary question: what belongs to “SYM parsing rules” versus “CORE parsing rules”, and how do Bellman-calculus parsing rules integrate with each SYM tier?

To keep this crisp, it helps to state a minimal contract.

Bellman-calculus contract (minimal requirement for SYM): a SYM representation is admissible for BE-DDSL if, after applying its implicit rules / desugaring directives, it yields a well-formed Bellman-calculus stage (a normalized, typed stage-level AST with enough explicit structure to interpret under Υ and to support methodization + calibration before applying $\mathbf{\mathrm{Ρ}}$).

Under this view, a HABLO parser can be a distinct SYM front-end even if it “processes to produce a Bellman-calculus model”: it is simply another SYM tier whose output (after normalization) is the same Bellman-calculus stage language that downstream CORE machinery consumes.

8.3 The Bijection Requirement¶

The critical property of the DDSL directive system is:

The mapping from normalized SYM (post-desugaring) to CORE must be a bijection (one-to-one and onto).

This is compatible with having multiple concrete SYM tiers: different surface syntaxes (e.g., Dolo+ SYM vs a HABLO-style SYM) may differ in their front-end parsing and inference rules, but they should lower into the same normalized SYM stage AST before applying the SYM ↔ T core directives.

8.3.1 Why Bijection?¶

Round-tripping: Given a CORE representation, we can always produce the equivalent SYM representation. This is essential for:
Error messages (show users SYM, not CORE)
Model inspection and debugging
Documentation generation
Semantic equivalence: A SYM model and its CORE translation have identical mathematical meaning under Υ: $$\Upsilon(\text{SYM}) = \Upsilon(\phi(\text{SYM})) = \Upsilon(\text{CORE})$$ where $\phi : \text{SYM} \to \text{CORE}$ is the forward translation.
No information loss: The translation preserves all semantic content. Nothing is "added" or "lost" in translation—only the syntactic representation changes.

Practically: we mean semantic information loss. YAML comments, key ordering, and other purely presentational details are not part of the semantic model and are not expected to round-trip.

Invertibility: Given a CORE model, we can recover the "canonical" SYM form: $$\phi^{-1} : \text{CORE} \to \text{SYM}$$

8.3.2 The Directive System¶

The BE-DDSL directives define the bijection formally. Each directive specifies:

SYM pattern: What the surface syntax looks like
CORE pattern: What the internal representation looks like
Translation rules: How to map between them (both directions)
Inference rules: What can be inferred vs. what must be explicit

Example directive (informal):

SYM Pattern	CORE Pattern	Notes
`states: [m]`	`variables.states: "state_m": m @in X_state`	Space inferred from `domain:` block
`E[f(x,y)]`	`E_y[f(x,y)]` where `y` is declared as shock	Operator instance inferred from shock variable
`transition: \\| m' = ...`	`g_av: [xa, y] @mapsto xv @via \\| xv = ...`	Transition becomes explicit transition function

8.3.3 Handling Ambiguity¶

Some SYM constructs may appear ambiguous but must resolve to unique CORE representations. The directive system handles this via:

Contextual inference: Use surrounding declarations to disambiguate
Canonical forms: When multiple CORE representations are semantically equivalent, choose a canonical one
Explicit disambiguation: Require the user to be more explicit if ambiguity cannot be resolved

8.4 Integration with Existing Architecture¶

The two-layer architecture extends the existing Υ/$\mathbf{\mathrm{Ρ}}$ framework:

            ┌───────────────────────────────────────────────────┐
            │                  USER INTERFACE                    │
            │                                                   │
            │   SYM (economist-friendly)                        │
            │   - Backwards-compatible with Dolo                │
            │   - Intuitive, minimal syntax                     │
            └────────────┬──────────────────────┬───────────────┘
                         │                      │
           φ (directives:│                      │ Ρ₀ (?)
           bijection)    │                      │ direct SYM-to-code
                         ▼                      │
            ┌──────────────────────────────┐    │
            │     INTERNAL REPRESENTATION  │    │
            │                              │    │
            │  CORE (operator-theoretic)   │    │
            │  - Explicit typing, op reg.  │    │
            │  - Methodization functor     │    │
            │  - Backus FFP principles     │    │
            └──────┬────────────┬──────────┘    │
                   │            │               │
            Υ (meaning) + Methodization         │
                   │    + Calibration            │
                   ▼            ▼               │
            ┌────────────┐ ┌──────────────┐     │
            │ MATH       │ │ CORE         │     │
            │ MEANING    │ │ (methodized, │     │
            │ (𝔻𝕄)       │ │  calibrated) │     │
            │            │ └──────┬───────┘     │
            │ Bellman    │        │             │
            │ ops,       │        │ Ρ           │
            │ theorems   │        │             │
            └────────────┘        ▼             ▼
            ┌───────────────────────────────────────────────────┐
            │           COMPUTATIONAL REPRESENTATION            │
            │           (𝔽^COMP)                                │
            │                                                   │
            │           AST, Python/Julia code                  │
            └───────────────────────────────────────────────────┘

In the ASCII diagram, Ρ is shorthand for the representation map $\mathbf{\mathrm{Ρ}}$, and Ρ₀ is shorthand for $\mathbf{\mathrm{Ρ}}_0$.

Key Points:¶

Υ applies to CORE: The mathematical meaning map operates on CORE, which has the explicit structure needed for formal interpretation.
$\mathbf{\mathrm{Ρ}}$ requires methodization: The computational representation map applies after methodization and calibration, which also operate on CORE.
SYM is purely syntactic sugar: SYM is a notational convenience; all semantic operations happen at the CORE level.
Users can write in either layer: Power users can write directly in CORE if they prefer explicit control. Most users write in SYM and let the directives handle translation.
Possible direct path $\mathbf{\mathrm{Ρ}}_0$(?): there may also be a direct map from SYM to computational code that skips CORE entirely. The question mark is intentional: it is an open question whether such a shortcut can preserve the semantic guarantees of the canonical SYM → φ → CORE → (methodize+calibrate) → $\mathbf{\mathrm{Ρ}}$ path.

8.5 Implementation Strategy¶

Phase 1: Define Core-Compatible SYM Subset¶

Identify which current Dolo constructs map cleanly to CORE
Document the translation rules
Ensure existing models work unchanged

Phase 2: Implement Forward Translation (φ)¶

Parser produces CORE from SYM input
Validate that translation preserves semantics
Handle inference and disambiguation

Phase 3: Implement Inverse Translation (φ⁻¹)¶

Generate SYM from CORE for error messages, documentation
Define canonical SYM forms for CORE constructs
Ensure round-trip consistency: φ⁻¹(φ(S)) ≈ S (up to canonical form)

Phase 4: Extend SYM Vocabulary¶

Add new SYM constructs for DDSL features (stages, perches, movers)
Maintain backwards compatibility
Document new syntax for users

Status note (implementation reality)¶

As of early 2026, the project already uses a small but important slice of this architecture in practice:

dolo-plus YAML acts as an initial SYM extension for stages/periods/nests.
The doloplus_to_dolo pipeline is an early prototype of the forward translation φ into a solver-ready internal representation (currently, vanilla Dolo models).

This is compatible with the design note: it prioritizes forward translation and semantic fidelity first, while full CORE ↔ SYM round-tripping is deferred.

8.6 Relation to Backwards Compatibility¶

The two-layer architecture directly addresses the backwards compatibility constraint:

Constraint	How Two-Layer Architecture Satisfies It
Existing Dolo models must work	SYM layer accepts current syntax unchanged
No breaking changes	CORE is internal; users never see it unless they opt in
Spirit of current syntax	SYM is designed to feel like natural Dolo evolution
Formal guarantees	CORE provides rigorous semantics, invisible to casual users

8.7 Glossary Additions¶

Term	Definition
SYM	Surface syntax layer; economist-friendly, backwards-compatible
CORE	Internal representation; operator-theoretic, formally rigorous
φ (phi)	Forward translation map: SYM → CORE
φ⁻¹	Inverse translation map: CORE → SYM
Directive	Rule specifying the bijection between SYM and CORE patterns
Canonical form	The standard SYM representation for a given CORE construct

8.8 Key Design Insight: Implicit vs. Explicit Functional Notation¶

The Problem in Current Dolo¶

In Dolo's YAML syntax, when you write:

values: [V]
equations:
    value: |
        V[t] = u(c[t]) + β * V[t+1]

Several things are implicit: 1. V is a function V: (exogenous × states) → R, not just a scalar 2. V[t] means "evaluate V at current state point" 3. V[t+1] means "evaluate V at next-period states" (requires transition + interpolation) 4. The expectation over shocks is implied by the recursive structure 5. The grid/interpolation is specified elsewhere (options:, domain:)

This works well for users who understand the conventions, but the function/variable distinction is never made explicit.

SYM Approach: Keep Implicit Notation, Explicit Assumptions¶

Key insight: SYM does not need to introduce explicit functional notation at the surface level.

Instead of forcing users to write:

# CORE style (too verbose for applied economists)
V: Xa -> R @mapsto v @via |
    v = u(c) + β * E_w[V(g_av(xa, w))]

SYM can preserve familiar Dolo-style syntax:

# SYM style (familiar, minimal)
values: [V]
equations:
    value: |
        V[t] = u(c[t]) + β * V[t+1]

But the implicit assumptions become explicit in the DDSL directive system:

Implicit in Syntax	Explicit in Directives/CORE
`V` is a function	`V: (exogenous × states) → R`
`[t]` means current point	Evaluation at grid point
`[t+1]` means next-period	`E_shock[V(transition(state, shock))]`
Interpolation method	From `options.grid` → APPROX operator
Integration method	From exogenous spec → expectation scheme

Benefits of This Approach¶

Zero learning curve: Existing Dolo users see no changes
Formal semantics available: CORE makes everything explicit
Round-tripping works: SYM → CORE → SYM preserves intent
Power users can opt in: Explicit functional notation available if desired
Better error messages: Errors can reference implicit assumptions

Translation Example¶

SYM (what user writes):

symbols:
    states: [w]
    controls: [c]
    values: [V]
equations:
    value: |
        V[t] = log(c[t]) + β * V[t+1]

CORE (internal representation):

symbols:
    spaces:
        "state_space": X_w @def R+
    variables:
        states:
            "wealth": w @in X_w
        controls:
            "consumption": c @in R+
    functions:
        outputs:
            "value_function": V: X_w -> R
operators:
    - name: E_shock
      class: expectation
    - name: APPROX_V
      class: approximation
perches:
    decision:
        backward:
            B_value: V_next -> V @via |
                V(w) = log(c_star(w)) + β * E_shock[V_next(g(w, c_star(w), shock))]

The directives specify exactly how the SYM constructs map to CORE constructs.

8.9 Open Questions¶

Inference boundaries: How much should SYM infer vs. require explicitly? Trade-off between convenience and potential ambiguity.
Extension syntax: When SYM needs new constructs (e.g., for ADC perches), what should they look like?
Error messages: When translation fails, how do we report errors in terms the user understands (SYM) rather than internal terms (CORE)?
Partial CORE exposure: Should power users be able to mix SYM and CORE syntax in the same file?
Where does Bellman-calculus parsing live?: Is Bellman-calculus a CORE sublanguage with its own parser/typechecker (consumed by multiple SYM front-ends), or is “Bellman-calculus parsing” inseparable from each SYM tier’s parsing rules?
Minimal SYM contract: What is the smallest set of declarations/equations a SYM tier must make explicit so that, after implicit rules are applied, we deterministically obtain a Bellman-calculus stage (suitable for Υ, methodization, calibration, and then $\mathbf{\mathrm{Ρ}}$)?

8.10 Summary¶

The SYM / CORE architecture provides:

User-friendliness: Economists write in familiar, intuitive syntax
Formal rigor: Internal representation supports all theoretical guarantees
Backwards compatibility: Existing Dolo models work without modification
Extensibility: New DDSL features can be added at CORE level and surfaced via SYM
Round-tripping: Bijection ensures no information loss between layers

This architecture is the key to satisfying both the applied-economist usability requirements and the formal-semantics requirements of the DDSL project.