8. SYM / T core Two-Layer Syntax Architecture¶
Version: 0.1.0
Date: December 22, 2025
Status: Design Note
8.1 Motivation¶
The BE-DDSL foundations (§1–§7) define a rigorous operator-theoretic framework based on Backus' Functional Programming principles. This framework—with its explicit typing, operator registry, methodization functor, and representation maps—provides the formal guarantees we need for correctness and composability.
However, this operator-theoretic syntax may be unfamiliar or intimidating to applied economists, who are accustomed to more "intuitive" model specification styles (e.g., current Dolo YAML, Dynare, HARK).
Moreover, we have a backwards compatibility constraint: existing Dolo models must continue to work unchanged.
This section introduces a two-layer syntax architecture that resolves this tension:
- SYM ("Surface/Symbolic"): An economist-friendly surface syntax
- T core ("Core/Operator-theoretic"): The rigorous internal representation
Terminology: we now call the internal layer T core (older docs: “CORE” / “DDSL-CORE”).
8.2 The Two Layers¶
8.2.1 SYM (Surface Layer)¶
SYM is the user-facing syntax layer. It is designed to:
- Feel natural to applied economists
- Uses familiar notation and conventions
- Minimizes explicit type annotations where they can be inferred
-
Allows "implicit" operator usage (e.g.,
E[...]without declaring operator instances) -
Maintain backwards compatibility with Dolo
- Existing Dolo models parse without modification
- Current
symbols:,equations:,calibration:blocks work as before -
DDSL extensions are purely additive
-
Preserve the spirit of current syntax
- Declarative, not imperative
- Minimal boilerplate
- Economist-readable at a glance
Example (SYM style):
symbols:
states: [m]
controls: [c]
parameters: [β, γ, r]
equations:
transition: |
m_next = (m - c) * r + y
arbitrage: |
β * (c_next/c)^(-γ) * r = 1
This is essentially current Dolo syntax—no operator registry, no explicit typing, no @mapsto.
8.2.2 T core (Internal Layer)¶
T core is the rigorous internal representation. It is:
- Fully operator-theoretic
- All operators explicitly declared in operator registry
- All symbols explicitly typed with
@in,@def,@mapsto -
Movers, perches, and ADC structure fully specified
-
Supports formal semantics
- Every construct has a well-defined Υ (mathematical meaning)
- Methodization functor operates on explicit operator instances
-
Type system ensures composition validity
-
Based on Backus FFP principles
- Named function objects (nullary for constants/spaces, higher-order for operators)
- No hidden state; single-assignment semantics
- Algebraic laws for transformation
Analogy: Lean 4 elaboration ≈ φ (SYM → T core)
The relationship between SYM and T core parallels Lean 4's elaboration pipeline. In Lean, elaboration is the process that transforms surface-level Syntax (user-facing, sugar-rich) into fully-typed Expr kernel terms (small, trusted). The kernel never sees surface syntax — it only type-checks elaborated terms.
In Bellman calculus the same pattern holds: the φ map (SYM → T core) is the elaboration step. It desugars perch-tag aliases, resolves bare symbols, infers operator instances, and normalizes implicit function typing — producing a fully-typed Bellman-calculus stage IR that downstream machinery (Υ, methodization, Ρ) can trust without re-parsing the surface YAML.
| Lean 4 | Bellman calculus |
|---|---|
Syntax (surface) |
SYM (dolo-plus YAML) |
| Elaboration (tactics, macros, type inference) | φ (desugaring, perch inference, operator resolution) |
Expr (kernel term) |
T core (normalized, typed stage IR) |
| Kernel (CIC type-checker) | T core validator (perch measurability, type consistency) |
See spec 0.2a (compiler pipeline architecture) for the full Lean-inspired pipeline design, and the Lean 4 source context for the reference material. #elaboration #lean4
Example (T core style):
symbols:
spaces:
"arrival_state_space": Xa @def R+
"decision_state_space": Xv @def R+
variables:
states:
"arrival_state": xa @in Xa
"decision_state": xv @in Xv
controls:
"consumption": c @in R+
parameters:
"discount_factor": β @in (0,1)
"risk_aversion": γ @in R+
"return_factor": r @in R+
operators:
- name: E_y
class: expectation
- name: argmax_c
class: optimization
perches:
arrival:
backward:
B_arvl: Vv -> Va @via |
Va(xa) = E_y[Vv(g_av(xa, y))]
decision:
backward:
B_dcsn: Ve -> [Vv, c_star] @via |
c_star(xv) = argmax_{c in Gamma(xv)} Q(xv, c)
Vv(xv) = Q(xv, c_star(xv))
8.2.3 Multiple SYM tiers (and the Bellman-calculus contract)¶
In practice, SYM is not a single surface language. It is better thought of as a family of increasingly high-level surface representations, all of which lower to the same internal CORE meaning.
- dolo-plus SYM (current): the “schema examples” used throughout the dolo-plus syntax/semantics notes (e.g. Dolo+ syntax & semantics overview) are one concrete SYM tier. They already rely on implicit function definitions (e.g., transitions and perch-local movers are specified via equation blocks rather than explicit
@mapstofunctions everywhere). - HABLO-style SYM (hypothetical higher tier): Alan’s HABLO-style model syntax appears to allow more implicit structure (e.g., implicit definition of transition functions and value functions from higher-level declarations). Each such step upward requires additional assumptions, additional inference rules, and usually additional parsing conventions.
One concrete example of “dolo-plus SYM sugar” is implicit function typing / implicit evaluation. For example, a stage may contain an equation like:
The surface form does not explicitly declare the type of V_cntn_fn (e.g., \(V_{\succ}: X_{\succ} \to \mathbb{R}\)) nor the evaluation operator; instead, SYM parsing rules infer these from context (perch tags like [>], symbol declarations, and stage structure) and lower this to an explicit, typed CORE form.
This raises a genuine boundary question: what belongs to “SYM parsing rules” versus “CORE parsing rules”, and how do Bellman-calculus parsing rules integrate with each SYM tier?
To keep this crisp, it helps to state a minimal contract.
Bellman-calculus contract (minimal requirement for SYM): a SYM representation is admissible for BE-DDSL if, after applying its implicit rules / desugaring directives, it yields a well-formed Bellman-calculus stage (a normalized, typed stage-level AST with enough explicit structure to interpret under Υ and to support methodization + calibration before applying \(\mathbf{\mathrm{Ρ}}\)).
Under this view, a HABLO parser can be a distinct SYM front-end even if it “processes to produce a Bellman-calculus model”: it is simply another SYM tier whose output (after normalization) is the same Bellman-calculus stage language that downstream CORE machinery consumes.
8.3 The Bijection Requirement¶
The critical property of the DDSL directive system is:
The mapping from normalized SYM (post-desugaring) to CORE must be a bijection (one-to-one and onto).
This is compatible with having multiple concrete SYM tiers: different surface syntaxes (e.g., Dolo+ SYM vs a HABLO-style SYM) may differ in their front-end parsing and inference rules, but they should lower into the same normalized SYM stage AST before applying the SYM ↔ T core directives.
8.3.1 Why Bijection?¶
- Round-tripping: Given a CORE representation, we can always produce the equivalent SYM representation. This is essential for:
- Error messages (show users SYM, not CORE)
- Model inspection and debugging
-
Documentation generation
-
Semantic equivalence: A SYM model and its CORE translation have identical mathematical meaning under Υ: $\(\Upsilon(\text{SYM}) = \Upsilon(\phi(\text{SYM})) = \Upsilon(\text{CORE})\)$ where \(\phi : \text{SYM} \to \text{CORE}\) is the forward translation.
-
No information loss: The translation preserves all semantic content. Nothing is "added" or "lost" in translation—only the syntactic representation changes.
Practically: we mean semantic information loss. YAML comments, key ordering, and other purely presentational details are not part of the semantic model and are not expected to round-trip.
- Invertibility: Given a CORE model, we can recover the "canonical" SYM form: $\(\phi^{-1} : \text{CORE} \to \text{SYM}\)$
8.3.2 The Directive System¶
The BE-DDSL directives define the bijection formally. Each directive specifies:
- SYM pattern: What the surface syntax looks like
- CORE pattern: What the internal representation looks like
- Translation rules: How to map between them (both directions)
- Inference rules: What can be inferred vs. what must be explicit
Example directive (informal):
| SYM Pattern | CORE Pattern | Notes |
|---|---|---|
states: [m] |
variables.states: "state_m": m @in X_state |
Space inferred from domain: block |
E[f(x,y)] |
E_y[f(x,y)] where y is declared as shock |
Operator instance inferred from shock variable |
transition: \| m' = ... |
g_av: [xa, y] @mapsto xv @via \| xv = ... |
Transition becomes explicit transition function |
8.3.3 Handling Ambiguity¶
Some SYM constructs may appear ambiguous but must resolve to unique CORE representations. The directive system handles this via:
- Contextual inference: Use surrounding declarations to disambiguate
- Canonical forms: When multiple CORE representations are semantically equivalent, choose a canonical one
- Explicit disambiguation: Require the user to be more explicit if ambiguity cannot be resolved
8.4 Integration with Existing Architecture¶
The two-layer architecture extends the existing Υ/\(\mathbf{\mathrm{Ρ}}\) framework:
┌───────────────────────────────────────────────────┐
│ USER INTERFACE │
│ │
│ SYM (economist-friendly) │
│ - Backwards-compatible with Dolo │
│ - Intuitive, minimal syntax │
└────────────┬──────────────────────┬───────────────┘
│ │
φ (directives:│ │ Ρ₀ (?)
bijection) │ │ direct SYM-to-code
▼ │
┌──────────────────────────────┐ │
│ INTERNAL REPRESENTATION │ │
│ │ │
│ CORE (operator-theoretic) │ │
│ - Explicit typing, op reg. │ │
│ - Methodization functor │ │
│ - Backus FFP principles │ │
└──────┬────────────┬──────────┘ │
│ │ │
Υ (meaning) + Methodization │
│ + Calibration │
▼ ▼ │
┌────────────┐ ┌──────────────┐ │
│ MATH │ │ CORE │ │
│ MEANING │ │ (methodized, │ │
│ (𝔻𝕄) │ │ calibrated) │ │
│ │ └──────┬───────┘ │
│ Bellman │ │ │
│ ops, │ │ Ρ │
│ theorems │ │ │
└────────────┘ ▼ ▼
┌───────────────────────────────────────────────────┐
│ COMPUTATIONAL REPRESENTATION │
│ (𝔽^COMP) │
│ │
│ AST, Python/Julia code │
└───────────────────────────────────────────────────┘
In the ASCII diagram, Ρ is shorthand for the representation map \(\mathbf{\mathrm{Ρ}}\), and Ρ₀ is shorthand for \(\mathbf{\mathrm{Ρ}}_0\).
Key Points:¶
-
Υ applies to CORE: The mathematical meaning map operates on CORE, which has the explicit structure needed for formal interpretation.
-
\(\mathbf{\mathrm{Ρ}}\) requires methodization: The computational representation map applies after methodization and calibration, which also operate on CORE.
-
SYM is purely syntactic sugar: SYM is a notational convenience; all semantic operations happen at the CORE level.
-
Users can write in either layer: Power users can write directly in CORE if they prefer explicit control. Most users write in SYM and let the directives handle translation.
-
Possible direct path \(\mathbf{\mathrm{Ρ}}_0\)(?): there may also be a direct map from SYM to computational code that skips CORE entirely. The question mark is intentional: it is an open question whether such a shortcut can preserve the semantic guarantees of the canonical SYM → φ → CORE → (methodize+calibrate) → \(\mathbf{\mathrm{Ρ}}\) path.
8.5 Implementation Strategy¶
Phase 1: Define Core-Compatible SYM Subset¶
- Identify which current Dolo constructs map cleanly to CORE
- Document the translation rules
- Ensure existing models work unchanged
Phase 2: Implement Forward Translation (φ)¶
- Parser produces CORE from SYM input
- Validate that translation preserves semantics
- Handle inference and disambiguation
Phase 3: Implement Inverse Translation (φ⁻¹)¶
- Generate SYM from CORE for error messages, documentation
- Define canonical SYM forms for CORE constructs
- Ensure round-trip consistency: φ⁻¹(φ(S)) ≈ S (up to canonical form)
Phase 4: Extend SYM Vocabulary¶
- Add new SYM constructs for DDSL features (stages, perches, movers)
- Maintain backwards compatibility
- Document new syntax for users
Status note (implementation reality)¶
As of early 2026, the project already uses a small but important slice of this architecture in practice:
- dolo-plus YAML acts as an initial SYM extension for stages/periods/nests.
- The
doloplus_to_dolopipeline is an early prototype of the forward translation φ into a solver-ready internal representation (currently, vanilla Dolo models).
This is compatible with the design note: it prioritizes forward translation and semantic fidelity first, while full CORE ↔ SYM round-tripping is deferred.
8.6 Relation to Backwards Compatibility¶
The two-layer architecture directly addresses the backwards compatibility constraint:
| Constraint | How Two-Layer Architecture Satisfies It |
|---|---|
| Existing Dolo models must work | SYM layer accepts current syntax unchanged |
| No breaking changes | CORE is internal; users never see it unless they opt in |
| Spirit of current syntax | SYM is designed to feel like natural Dolo evolution |
| Formal guarantees | CORE provides rigorous semantics, invisible to casual users |
8.7 Glossary Additions¶
| Term | Definition |
|---|---|
| SYM | Surface syntax layer; economist-friendly, backwards-compatible |
| CORE | Internal representation; operator-theoretic, formally rigorous |
| φ (phi) | Forward translation map: SYM → CORE |
| φ⁻¹ | Inverse translation map: CORE → SYM |
| Directive | Rule specifying the bijection between SYM and CORE patterns |
| Canonical form | The standard SYM representation for a given CORE construct |
8.8 Key Design Insight: Implicit vs. Explicit Functional Notation¶
The Problem in Current Dolo¶
In Dolo's YAML syntax, when you write:
Several things are implicit:
1. V is a function V: (exogenous × states) → R, not just a scalar
2. V[t] means "evaluate V at current state point"
3. V[t+1] means "evaluate V at next-period states" (requires transition + interpolation)
4. The expectation over shocks is implied by the recursive structure
5. The grid/interpolation is specified elsewhere (options:, domain:)
This works well for users who understand the conventions, but the function/variable distinction is never made explicit.
SYM Approach: Keep Implicit Notation, Explicit Assumptions¶
Key insight: SYM does not need to introduce explicit functional notation at the surface level.
Instead of forcing users to write:
# CORE style (too verbose for applied economists)
V: Xa -> R @mapsto v @via |
v = u(c) + β * E_w[V(g_av(xa, w))]
SYM can preserve familiar Dolo-style syntax:
But the implicit assumptions become explicit in the DDSL directive system:
| Implicit in Syntax | Explicit in Directives/CORE |
|---|---|
V is a function |
V: (exogenous × states) → R |
[t] means current point |
Evaluation at grid point |
[t+1] means next-period |
E_shock[V(transition(state, shock))] |
| Interpolation method | From options.grid → APPROX operator |
| Integration method | From exogenous spec → expectation scheme |
Benefits of This Approach¶
- Zero learning curve: Existing Dolo users see no changes
- Formal semantics available: CORE makes everything explicit
- Round-tripping works: SYM → CORE → SYM preserves intent
- Power users can opt in: Explicit functional notation available if desired
- Better error messages: Errors can reference implicit assumptions
Translation Example¶
SYM (what user writes):
CORE (internal representation):
symbols:
spaces:
"state_space": X_w @def R+
variables:
states:
"wealth": w @in X_w
controls:
"consumption": c @in R+
functions:
outputs:
"value_function": V: X_w -> R
operators:
- name: E_shock
class: expectation
- name: APPROX_V
class: approximation
perches:
decision:
backward:
B_value: V_next -> V @via |
V(w) = log(c_star(w)) + β * E_shock[V_next(g(w, c_star(w), shock))]
The directives specify exactly how the SYM constructs map to CORE constructs.
8.9 Open Questions¶
-
Inference boundaries: How much should SYM infer vs. require explicitly? Trade-off between convenience and potential ambiguity.
-
Extension syntax: When SYM needs new constructs (e.g., for ADC perches), what should they look like?
-
Error messages: When translation fails, how do we report errors in terms the user understands (SYM) rather than internal terms (CORE)?
-
Partial CORE exposure: Should power users be able to mix SYM and CORE syntax in the same file?
-
Where does Bellman-calculus parsing live?: Is Bellman-calculus a CORE sublanguage with its own parser/typechecker (consumed by multiple SYM front-ends), or is “Bellman-calculus parsing” inseparable from each SYM tier’s parsing rules?
-
Minimal SYM contract: What is the smallest set of declarations/equations a SYM tier must make explicit so that, after implicit rules are applied, we deterministically obtain a Bellman-calculus stage (suitable for Υ, methodization, calibration, and then \(\mathbf{\mathrm{Ρ}}\))?
8.10 Summary¶
The SYM / CORE architecture provides:
- User-friendliness: Economists write in familiar, intuitive syntax
- Formal rigor: Internal representation supports all theoretical guarantees
- Backwards compatibility: Existing Dolo models work without modification
- Extensibility: New DDSL features can be added at CORE level and surfaced via SYM
- Round-tripping: Bijection ensures no information loss between layers
This architecture is the key to satisfying both the applied-economist usability requirements and the formal-semantics requirements of the DDSL project.