0.1c Calibration
dolo-plus (ADC dialect): spec_0.1c (decorators + calibration + methods)
This milestone defines the minimum, backwards-compatible way to attach typed symbol metadata (decorators) to a Dolo/dolo-plus model without breaking vanilla Dolo.
The immediate focus is parameter registration (e.g., record that \( \beta \in (0,1) \)), but the same mechanism applies to all symbol groups (states, controls, shocks, settings, spaces, etc.). Parameters are just the first place we can safely add numeric validation via calibration.
Why this milestone exists
We want to get as close as possible to the ideal compatibility situation:
- We can parse legacy Dolo YAML using the new frontend(s) (“dolang+” / dolo-plus importer).
- The resulting compiled `Model` object is still runnable by old Dolo code paths and solvers.
The key trap is: decorators must never become part of a symbol’s name.
If a model ends up with a parameter literally named β @in (0,1), everything downstream breaks (calibration lookup, equation parsing, solver compilation).
So: dolo-plus must keep canonical bare symbol names separate from any decorator text, while still letting the symbols object carry richer declaration data in dolo-plus mode.
Scope (0.1c)
0.1c adds three things, in a way that is additive and opt-in:
1) Decorators: Attach decorator applications (e.g. @in (0,1), @in R+, @def ClOp(a,b)) to declared symbols (focus: parameters, but same mechanism applies to all symbol groups).
2) Calibration plumbing: Optionally validate calibrated parameter values against decorators (e.g., β ∈ (0,1)).
3) Methods plumbing: Ensure method/config blocks needed for the “horse” solver are carried through translation unchanged (no new method language yet).
Non-goals (0.1c)
- Do not introduce T core-style functional declarations in Dolo+ YAML (e.g. no `@mapsto`, no `@via` bodies). Those belong in T core (formerly “DDSL-CORE”), not in Dolo+ SYM.
- Do not redesign Dolo’s solver selection / recipe system.
- Do not require full typechecking of all domains; initial domain validation is intentionally minimal and conservative.
1. Compatibility contract (must hold)
1.1 Legacy Dolo compatibility
- A plain Dolo model file with:
  - `symbols: {parameters: [β, γ, r], ...}`
  - `calibration: {β: 0.96, ...}`
  - standard `equations:` blocks
  must continue to import/compile/solve unchanged.
1.2 Decorator safety invariant
- Invariant: the canonical symbol name used for calibration lookup / equation parsing / solver compilation is always the bare name (e.g. `β`, not `β @in (0,1)`).
- In dolo-plus mode, `symbols` may additionally carry decorator metadata and labels, but it must be possible to recover a legacy “names-only” view for the horse / old Dolo code paths.
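The names-only view required by the invariant can be sketched as below. This is a minimal illustration, not the importer's actual code: the function name `names_only_view` and the guard against decorator text leaking into a name are assumptions, and the input is taken to be an already-loaded `symbols` mapping.

```python
from collections.abc import Mapping
from typing import Dict, List


def names_only_view(symbols: Mapping) -> Dict[str, List[str]]:
    """Return the legacy view: group -> ordered list of bare symbol names.

    Hypothetical helper. Each group is either a legacy list of names or a
    dolo-plus mapping from bare name to decorator text.
    """
    view: Dict[str, List[str]] = {}
    for group, decl in symbols.items():
        if isinstance(decl, Mapping):
            names = list(decl.keys())  # dicts preserve insertion order
        else:
            names = list(decl)
        for name in names:
            # Decorator text must never end up inside a symbol name.
            if not name or name != name.strip() or "@" in name or " " in name:
                raise ValueError(f"{group}: {name!r} is not a bare symbol name")
        view[group] = names
    return view
```

With this shape, a decorated `parameters` mapping and a legacy `states` list both reduce to plain `list[str]` values, which is all the old code paths need to see.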
2. YAML surface syntax (dolo-plus SYM)
2.1 Canonical pattern (recommended): decorators live in symbols (uniformly, for any group)
In dolo-plus SYM we still want symbols to be “close to Dolo”, but we allow decorated declarations inside symbols itself.
Example (same idea for parameters, states, controls, user-defined spaces, settings, …):
```yaml
# stage.yml (symbolic stage; no numbers attached)
symbols:
  spaces:
    # Primitive space constructors like R/R+/… are built in.
    # But users may also define new named spaces from primitives + parameters/settings.
    X: @def ClOp(a, b)
  parameters:
    a: @in R
    b: @in R
    β: @in (0,1)
  settings:
    # Settings are “parameters for numerics” (grid sizes, tolerances, quadrature nodes, …).
    n_w: @in Z+
    tol: @in R+
  states:
    w: @in X
  controls:
    c: @in R+

# calibration.yml (parameters only; applied by the calibration functor)
calibration:
  parameters:
    a: 0.0
    b: 10.0
    β: 0.96

# settings.yml (numerical settings values; applied by the settings functor / attachment)
settings:
  n_w: 200
  tol: 1.0e-8
```
Notes:
- Values like `@in (0,1)` are DDSL syntax objects (a decorator application) with meaning under \( \Upsilon \).
- Here `@in` is the decorator token (a primitive operator indicated by the leading `@`), and `(0,1)` is its argument.
- YAML carries `@in (0,1)` as a scalar, but dolo-plus should treat it as parseable syntax, not "just metadata".
- Likewise, `@def ClOp(a,b)` is a decorator application defining a new named space `X` from primitive objects plus symbol references.
- A declaration value may be either a single decorator application or a list of decorator applications; normalize internally to a list.
- In plain Dolo YAML, `symbols.parameters: [β, γ, r]` continues to work unchanged; this is a dolo-plus SYM extension.
- Likewise for other groups: plain Dolo list forms remain valid; dolo-plus may accept the mapping form where values are decorator applications.
- Avoid duplicate YAML keys: YAML mappings with duplicated keys silently overwrite earlier entries in many parsers. The dolo-plus importer should detect duplicates (at least for `symbols.*`) and raise a clear error instead of accepting silently.
- Keep equation payloads “pure”: anything inside a literal block scalar under `equations: ...: |` becomes part of the equation text passed to dolang. Put explanations as YAML `#` comments outside the `|` payload (not inside it).
Distribution decorator (@dist) for exogenous shocks
Exogenous shocks may carry a distribution declaration alongside their domain constraint. This uses the @dist decorator with a distribution constructor:
```yaml
symbols:
  exogenous:
    y:
      - @in Y                    # Domain constraint
      - @dist Normal(μ_y, σ_y)   # Distribution specification
```
Key points:
- `@dist` is a decorator that takes a distribution constructor as its argument.
- Distribution constructors are primitive DDSL objects (e.g., `Normal`, `LogNormal`, `Uniform`, `Beta`).
- Constructor arguments may be symbol references (parameters like `μ_y`, `σ_y`) or literals.
- The distribution parameters are resolved during calibration; the syntactic stage only records the structure.
- Multiple decorators on a single symbol are expressed as a YAML list.
Primitive distribution constructors (v0.1c):
| Constructor | Arity | Arg Types | Description |
|---|---|---|---|
| `Normal(μ, σ)` | 2 | (scalar, scalar) | Gaussian with mean μ, std σ |
| `LogNormal(μ, σ)` | 2 | (scalar, scalar) | Log-normal |
| `Uniform(a, b)` | 2 | (scalar, scalar) | Uniform on [a, b] |
| `Beta(α, β)` | 2 | (scalar, scalar) | Beta distribution |
| `UNormal(σ)` | 1 | (scalar,) | Normal with mean 0, std σ (Dolo shorthand) |
Note: Distributions are not domains — @in specifies the domain (support), @dist specifies the measure.
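As a rough illustration of the structural check implied by the constructor table, the sketch below splits a `@dist` application into constructor name and arguments and checks the arity. It is only a stand-in: the spec requires a Lark-based parse, and the regex, the name `parse_dist`, and the restriction to flat (non-nested) arguments are assumptions made for this example.

```python
import re
from typing import List, Tuple

# Arity table copied from the primitive distribution constructors (v0.1c).
DIST_ARITY = {"Normal": 2, "LogNormal": 2, "Uniform": 2, "Beta": 2, "UNormal": 1}

# Matches "@dist Name(arg, arg, ...)" with flat arguments only.
_DIST_CALL = re.compile(r"^@dist\s+(\w+)\s*\((.*)\)$")


def parse_dist(text: str) -> Tuple[str, List[str]]:
    """Return (constructor, args); args stay unresolved symbol names or literals."""
    match = _DIST_CALL.match(text.strip())
    if match is None:
        raise ValueError(f"not a @dist application: {text!r}")
    name = match.group(1)
    args = [a.strip() for a in match.group(2).split(",") if a.strip()]
    if name not in DIST_ARITY:
        raise ValueError(f"unknown distribution constructor: {name}")
    if len(args) != DIST_ARITY[name]:
        raise ValueError(f"{name} expects {DIST_ARITY[name]} argument(s), got {len(args)}")
    return name, args
```

The arguments are kept as text on purpose: resolving `μ_y` and `σ_y` to numbers belongs to calibration, not to the syntactic stage.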
2.2 Optional sugar (allowed, but not required in 0.1c)
We may later accept “full declaration strings” (e.g. value β @in (0,1)) if and only if the importer strips to the bare name (β) and stores the decorator metadata in a sidecar declarations object (see model.symbols_math below). This is optional and should not be required for 0.1c.
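If full declaration strings are ever accepted, the stripping step could look roughly like the following. This is a sketch under assumptions: `split_declaration` is a hypothetical helper, bare names are taken to be Unicode identifiers, and decorator bodies are assumed not to contain a literal `@`.

```python
import re
from typing import List, Tuple

# A bare name (Unicode letters/underscore, then word characters), then the rest.
_DECL = re.compile(r"^\s*([^\W\d]\w*)\s*(.*)$", re.DOTALL)


def split_declaration(text: str) -> Tuple[str, List[str]]:
    """Split 'β @in (0,1)' into the bare name and a list of decorator strings."""
    match = _DECL.match(text)
    if match is None:
        raise ValueError(f"declaration does not start with a symbol name: {text!r}")
    name, rest = match.group(1), match.group(2).strip()
    if rest and not rest.startswith("@"):
        raise ValueError(f"unexpected text after symbol name {name!r}: {rest!r}")
    # Each decorator application runs from one '@' up to the next '@'.
    decorators = [d.strip() for d in re.findall(r"@[^@]*", rest)]
    return name, decorators
```

The bare name would go into `model.symbols` and the decorator strings would be parsed and stored in the sidecar, so the decorated text never reaches calibration lookup or equation parsing.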
2.3 Ordering + labels (clarifications)
Ordering for mapping-valued symbol groups
Legacy Dolo treats symbol groups like symbols.parameters as ordered lists (order matters for vectorization and solver interfaces).
When dolo-plus accepts the mapping form (e.g. parameters: {β: @in (0,1), γ: @in R+}), we must define how to recover a deterministic list[str] ordering for the legacy view:
- Rule (v0.1c): preserve the YAML insertion order as encountered in the YAML node (PyYAML’s `yaml.compose(...)` preserves key order in the node’s `.value` list).
If a user needs a particular ordering, they should write the keys in that order in the YAML file.
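The ordering rule, together with duplicate-key detection, can be sketched with PyYAML's node-level API. The helper names `_child` and `ordered_names` are hypothetical. Note also that YAML reserves `@` as an indicator character, so PyYAML only accepts the decorator scalars when they are quoted (or pre-processed); the sample text below quotes them.

```python
import yaml

STAGE = '''
symbols:
  parameters:
    β: "@in (0,1)"
    γ: "@in R+"
    a: "@in R"
'''


def _child(mapping_node, key):
    """Return the value node stored under `key` in a MappingNode."""
    for key_node, value_node in mapping_node.value:
        if key_node.value == key:
            return value_node
    raise KeyError(key)


def ordered_names(yaml_text, group):
    """Bare names of symbols.<group> in source order; duplicates are an error."""
    root = yaml.compose(yaml_text, Loader=yaml.SafeLoader)
    node = _child(_child(root, "symbols"), group)
    if isinstance(node, yaml.MappingNode):
        names = [key_node.value for key_node, _ in node.value]
    else:  # legacy list-of-names form
        names = [item.value for item in node.value]
    seen = set()
    for name in names:
        if name in seen:
            raise ValueError(f"duplicate symbol {name!r} in symbols.{group}")
        seen.add(name)
    return names
```

Working on the composed node rather than the constructed dict is what makes the duplicate check possible: by the time a dict exists, the earlier entry has already been overwritten.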
Labels (future-compatible, not required in v0.1c)
Foundations often includes human labels (e.g. “discount factor”) alongside symbol names.
v0.1c does not require labels in the surface syntax. If/when we add label authoring support, labels should live alongside declarations in the same “symbols-math” sidecar object (so that bare names remain unchanged).
3. Internal representation (Model object)
3.1 Keep Dolo’s existing model representation: attach metadata as separate objects
To minimize new classes and preserve backwards compatibility, we keep the legacy core objects and attach new information as sidecar dictionary-like objects.
One-page “where does it live?” cheat sheet (authoring vs. storage)
In YAML (authoring surface):
- `symbols.parameters` / `symbols.settings`: declare which names exist (optionally with decorators like `@in ...`).
- calibration/settings numbers live outside the stage file (e.g. in a separate calibration/settings file).
In the in-memory model representation (storage):
- `model.symbols["parameters"]` and `model.symbols["settings"]` are names-only ordered lists (bare strings). These lists are what legacy Dolo uses for ordering/vectorization.
- `model.symbols_math["parameters"][name]` and `model.symbols_math["settings"][name]` store the parsed declarations (domains/defs/dists) for those same bare names. This object is attached by the importer (Dolo-side), but it is built from dolang/Lark parsing of decorator strings.
- `model.calibration[name]` stores parameter values (reuse the existing Dolo calibration dict/list object).
- `model.settings[name]` stores settings values (same shape/type as calibration, but a separate object).
So:
- symbols.settings (syntax) ⇒ model.symbols["settings"] (names) + model.symbols_math["settings"] (domains) + model.settings (numbers)
- symbols.parameters (syntax) ⇒ model.symbols["parameters"] (names) + model.symbols_math["parameters"] (domains) + model.calibration (numbers)
Concretely:
- `model.symbols`: legacy names-only view (group → ordered `list[str]` of bare symbol names). This remains the canonical source for calibration lookup, equation parsing, and solver compilation.
- `model.symbols_math` (dolo-plus only): declarations / “symbols-math” attachment. This stores the parsed decorator applications that give symbols their mathematical role/type:
  - `@in` domain membership (e.g. `β ∈ (0,1)`)
  - `@def` space/domain definitions (e.g. `X @def ClOp(a,b)`)
  - `@dist` distribution declarations for shocks
  Minimal requirement:
  - for each group, you can recover the declarations in the same order as `model.symbols[group]` (so the “symbols-math” object can be treated like an ordered list, analogous to calibration vectors), and
  - you can look up a symbol’s declarations by bare name (e.g. `model.symbols_math[group][name] -> list[AST]`, or an equivalent mapping).
  No new `SymbolDecl`/`DecoratorApp` classes are required; storing dolang/Lark AST nodes is sufficient.
- `model.calibration`: existing Dolo calibration object (e.g. `CalibrationDict`) for parameters. In Foundations language, this object is the concrete representation of the parameter calibration functor \( \mathcal{C}_{\text{param}} \).
- `model.settings`: settings values attachment, shaped like calibration (dict/list) but for `symbols.settings` (note the name clash: `symbols.settings` = names in syntax, `model.settings` = numbers in memory). (Implementation may reuse the same underlying calibration-dict type; the point is that settings are a separate object.)
- `model.methods` (0.1d): methods attachment mapping operator instances (e.g. `E_y`, `mover:...:approx:c`) to scheme/method selections and settings-symbol wiring.
This design keeps the “what vs how” separation:
- stage syntax (names + equations) stays stable (model.symbols, equations, timing metadata)
- “math typing” lives in model.symbols_math
- numbers live in model.calibration (parameters) and model.settings (numerics)
- numerical choices live in model.methods
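The layout described in this section might look as follows on a toy object. This is only an illustration: `SimpleNamespace` stands in for Dolo's `Model`, raw decorator strings stand in for dolang/Lark AST nodes, and `declarations_in_order` is a hypothetical helper showing the "same order as `model.symbols[group]`" requirement.

```python
from types import SimpleNamespace

# Stand-in for a parsed model; real code would store AST nodes in symbols_math.
model = SimpleNamespace(
    symbols={"parameters": ["a", "b", "β"], "settings": ["n_w", "tol"]},
    symbols_math={
        "parameters": {"a": ["@in R"], "b": ["@in R"], "β": ["@in (0,1)"]},
        "settings": {"n_w": ["@in Z+"], "tol": ["@in R+"]},
    },
    calibration={"a": 0.0, "b": 10.0, "β": 0.96},
    settings={"n_w": 200, "tol": 1.0e-8},
)


def declarations_in_order(model, group):
    """Declarations for `group`, aligned with the legacy ordering in model.symbols."""
    return [
        (name, model.symbols_math[group].get(name, []))
        for name in model.symbols[group]
    ]
```

Legacy code only ever touches `model.symbols` and `model.calibration` here, so the two sidecars can be absent or empty without affecting it.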
4. Parsing + validation rules (0.1c)
4.1 Parse order
1) Parse YAML
2) Build model.symbols (legacy names-only view):
- legacy list-of-names groups remain list-of-names
- mapping-valued groups contribute keys (bare names) in YAML insertion order
3) Build model.symbols_math (dolo-plus sidecar declarations):
- Required Lark step: each decorator application scalar (e.g. @in (0,1)) must be parsed with dolang’s Lark parser into a structured form (AST), rather than being kept as an opaque Python string.
- Store parsed declarations under model.symbols_math keyed by the same bare names used in model.symbols.
- This keeps responsibilities clean:
- dolang provides parsing (strings → AST),
- dolo attaches/stores the result on the model representation (model.symbols_math).
- Lexing note (important): primitives like R+ / R++ must not be interpreted as arithmetic (R then +). For v0.1c we should either:
- canonicalize to an identifier-safe alias before parsing (e.g. rewrite R+ → R_plus, R++ → R_plusplus using the registry), then parse; or
- define a dedicated decorator/domain start rule with tokens that treat R+ as a single primitive token in that context.
4) If calibration exists, optionally validate against model.symbols_math
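The first option in the lexing note (rewrite `R+` to an identifier-safe alias before parsing) could be sketched as below. The mapping mirrors the `identifier_safe` table of the registry; the regex approach and the name `to_identifier_safe_text` are assumptions for illustration, not the dolang implementation.

```python
import re

# canonical spelling -> identifier-safe token (as in the registry's identifier_safe table)
IDENTIFIER_SAFE = {"R+": "R_plus", "R++": "R_plusplus", "Z+": "Z_plus"}

# Longest spellings first, so "R++" is never read as "R+" followed by "+".
_ALTERNATIVES = "|".join(
    re.escape(key) for key in sorted(IDENTIFIER_SAFE, key=len, reverse=True)
)
# Only rewrite standalone primitives: not glued to a word character or another '+'.
_PRIMITIVE = re.compile(r"(?<![\w+])(" + _ALTERNATIVES + r")(?![\w+])")


def to_identifier_safe_text(text: str) -> str:
    """Rewrite primitive domain spellings in a decorator string before parsing."""
    return _PRIMITIVE.sub(lambda m: IDENTIFIER_SAFE[m.group(1)], text)
```

After this pass, `@in R+` reaches the parser as `@in R_plus`, so no `+` operator is ever seen in a domain position.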
4.2 Decorator table validation
In dolo-plus mode:
- For any symbol group where `symbols.<group>` is mapping-valued:
  - each key must be a valid bare symbol name, and
  - each value must parse as a decorator application (or a list of applications).
- If a decorator application references an unknown symbol (e.g., refers to `X` as a domain but `symbols.spaces` does not define `X` and `X` is not a primitive domain), raise a clear error explaining how to fix it.
4.2.1 Failure modes: parse vs resolve vs validate
To avoid confusion in implementation, distinguish:
1) Syntactic parse failure: decorator RHS doesn’t parse under the decorator/domain grammar ⇒ error with localized message.
2) Resolution failure: w: @in X where X is neither a primitive nor defined in symbols.spaces ⇒ error (cannot build the typing environment).
3) Resolution success, validation skipped: decorator parses and resolves, but 0.1c doesn’t enforce its numeric meaning ⇒ store the AST and skip numeric validation (metadata retained).
4.3 Implementation requirement: parse decorator applications via dolang (Lark)
Current vanilla Dolo behavior does not send the `symbols:` block through dolang/Lark; it just reads raw YAML scalars as names.
For dolo-plus SYM decorated declarations we must add an explicit parse step:
- For each mapping-valued entry like `β: @in (0,1)`:
  - take the YAML scalar `@in (0,1)` and call `dolang.symbolic.parse_string(...)` with a dedicated start rule (e.g. `decorator_application` or `symbol_decl_rhs`),
  - keep the resulting AST (or convert to a minimal normalized form), and
  - store it in `model.symbols_math` under the symbol’s bare name (dolo-plus mode).
The exact start rule name is up to implementation, but the contract is:
- decorator applications are parsed by Lark (same parsing technology as equations), and
- parse failures yield clear, localized errors pointing to the offending symbol declaration.
4.4 Pyramidal resolution: primitives → spaces → memberships
In dolo-plus mode, symbol declarations can reference other declared symbols, so symbols construction must be pyramidal (multi-pass), not “flat”.
Minimum required resolution behavior:
1) Parse all decorator applications into AST objects (per §4.3).
2) Resolve primitive domain objects:
- R, R+, R++, etc. are primitive DDSL model objects (built in).
3) Resolve user-defined spaces (symbols.spaces):
- Each space declaration X: @def <DomainExpr> introduces a named domain symbol X.
- <DomainExpr> may reference:
- primitive objects (e.g. R+), and/or
- other previously-defined spaces, and/or
- parameters/settings used as constructor arguments (e.g. ClOp(a,b)).
- The resolver must build a dependency graph and reject cycles in space definitions.
4) Resolve memberships (@in) in other groups:
- When a declaration says w: @in X, the resolver must link X to the space definition from symbols.spaces (or recognize it as primitive).
- If the domain name cannot be resolved, raise a clear error.
Note: v0.1c only needs name resolution + structural validation here. Numeric evaluation of ClOp(a,b) can remain deferred until calibration/representation layers.
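The resolution passes above can be sketched with the standard-library `graphlib` module (Python 3.9+). The sketch assumes the domain names referenced by each `@def` have already been extracted from the parsed ASTs, and it hard-codes the primitive set instead of reading the registry; `resolve_spaces` and `resolve_membership` are hypothetical names. Parameter arguments such as `a`, `b` in `ClOp(a,b)` are not tracked here.

```python
from graphlib import CycleError, TopologicalSorter
from typing import Dict, List, Set

# Assumed to come from the registry in a real implementation.
PRIMITIVES = {"R", "R+", "R++", "Z", "Z+"}


def resolve_spaces(space_refs: Dict[str, Set[str]]) -> List[str]:
    """Order user-defined spaces so each one comes after the spaces it refers to.

    space_refs maps a space name to the domain names used in its @def.
    """
    graph = {}
    for name, refs in space_refs.items():
        unknown = {r for r in refs if r not in PRIMITIVES and r not in space_refs}
        if unknown:
            raise ValueError(f"space {name!r} refers to undefined domain(s): {sorted(unknown)}")
        # Only space-to-space edges can form a cycle.
        graph[name] = {r for r in refs if r in space_refs}
    try:
        return list(TopologicalSorter(graph).static_order())
    except CycleError as exc:
        raise ValueError(f"cyclic space definitions: {exc.args[1]}") from exc


def resolve_membership(domain: str, space_refs: Dict[str, Set[str]]) -> str:
    """Classify the target of an @in as 'primitive' or 'space', else fail."""
    if domain in PRIMITIVES:
        return "primitive"
    if domain in space_refs:
        return "space"
    raise ValueError(f"unknown domain {domain!r}: define it under symbols.spaces")
```

For example, `resolve_spaces({"X": {"R+"}, "Y": {"X"}})` places `X` before `Y`, while two spaces defined in terms of each other are rejected.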
4.5 Minimal domain validation (parameter registration)
For 0.1c we only require safe checks with high signal and low parsing complexity.
Recognize the following `@in` domains:
- `R`, `R+`, `R++`
- open/closed intervals over numerals: `(a,b)`, `[a,b]`
If a domain is unrecognized, store it but do not reject the model.
Validation policy (recommended):
- If `@in (0,1)` and calibration provides `β: 1.2` ⇒ error.
- If `@in R+` and calibration provides `γ: -1` ⇒ error.
- If no calibration exists yet ⇒ no numeric validation (metadata only).
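A minimal sketch of this policy follows. It works on raw domain strings rather than parsed ASTs, which is a simplification for the example; `check_membership` and `validate` are hypothetical names, and `R+` is read as the nonnegative reals and `R++` as the positive reals, following the registry descriptions.

```python
import re
from typing import Optional

# "(a,b)" or "[a,b]" with numeric endpoints.
_INTERVAL = re.compile(r"^([\[(])\s*([-+0-9.eE]+)\s*,\s*([-+0-9.eE]+)\s*([\])])$")


def check_membership(domain: str, value: float) -> Optional[bool]:
    """True/False for a recognized domain; None when the domain is not recognized."""
    d = domain.strip()
    if d == "R":
        return True
    if d == "R+":
        return value >= 0
    if d == "R++":
        return value > 0
    match = _INTERVAL.match(d)
    if match is None:
        return None  # unrecognized: keep the metadata, skip validation
    try:
        lo, hi = float(match.group(2)), float(match.group(3))
    except ValueError:
        return None
    above = value >= lo if match.group(1) == "[" else value > lo
    below = value <= hi if match.group(4) == "]" else value < hi
    return above and below


def validate(name: str, domain: str, value: float) -> None:
    """Raise only when a recognized domain is violated."""
    if check_membership(domain, value) is False:
        raise ValueError(f"{name} = {value} violates '@in {domain}'")
```

Returning `None` for unrecognized domains keeps the "store it but do not reject the model" rule separate from an actual violation.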
4.6 Language-agnostic registry for decorators and primitive objects
We need a single “source of truth” for:
- which decorator tokens exist (e.g. `@in`, `@def`), and
- which primitive DDSL model objects exist (e.g. `R+`), including primitive domain constructors (e.g. `ClOp`).
This should be stored in a language-agnostic format so that:
- Python (dolang/Dolo) can load it for parsing/validation,
- other implementations (Julia, Rust, etc.) can consume the same registry, and
- we avoid hardcoding primitive lists into the Lark grammar or Python code.
Proposal (v0.1c):
- Add a YAML registry file under the dolang package, e.g.: `AI/prompts/dev-specs/dolo+/spec_0.1/0.1_c/ddsl_registry.yaml`
Registry responsibilities:
- Declare canonical spellings + arity for:
  - decorator tokens (names beginning with `@`),
  - primitive domains (`R`, `R+`, …),
  - domain constructors (`ClOp`, …),
  - and optional aliases (`R_+` → `R+`, etc.).
How it is used (minimum requirement):
- The dolo-plus importer loads this registry at model-load time.
- The decorator-application parser (§4.3) uses the registry to validate decorator tokens and constructor arity.
- The pyramidal resolver (§4.4) uses the registry to recognize primitives like `R+` as built-in objects.
We do not need to settle full semantics of all primitives in v0.1c — only identity, arity, and basic structural validation.
Scope note: this is not the methodization / schemes registry
Foundations also uses a separate registry for numerical schemes/methods (often called an operation registry / scheme registry).
To avoid confusion:
- `ddsl_registry.yaml` (this section) = syntax + typing primitives (decorators, primitive domains, domain constructors, aliases).
- `operation-registry.yml` (Foundations) = numerical methodization choices (which schemes exist; which methods are available for a given operator class).
4.7 Separate calibration file + calibration functor (Foundations-style)
Foundations treats calibration as a separate functor applied after we have a purely syntactic stage.
We want the following pipeline (minimal v0.1c target):
- `symbolic_stage = model(stage_file)` (parses syntax + symbols + equations; no numbers attached; this step should also create `symbols_math`)
- `calibration_stage = calibrate(symbolic_stage, calibration_file)` (attaches numbers / bindings)
- then we should have a `settingize_stage`...
Formal definition: calibration acts on the declared parameter list
Important clarification: The calibration functor is not an arbitrary map from names to values. It is a structured morphism whose domain is the declared parameter list from symbols.parameters.
Formally:
\[ \mathcal{C}_{\text{param}} : \text{PARAMS} \to O_P \]
Where:
- PARAMS is the ordered list of parameter symbols declared in symbols.parameters
- O_P is the space of primitive numerical objects (floats, arrays)
This means:
1. Only symbols declared in symbols.parameters can be calibrated via the parameter calibration functor
2. The calibration must respect the declared ordering (calibration arrays correspond positionally to symbol lists)
3. Values must satisfy any @in constraints from the declarations
4. The domain of the calibration functor is constrained by the stage's symbol declarations — it is not a free-form key-value store
Similarly for settings (post-methodization):
The combined calibration functor is:
What “settings” are (and why this is new relative to vanilla Dolo)
Settings are “parameters for numerics”: they feed method schemes (grid sizes, tolerances, quadrature node counts, interpolation orders, etc.).
Vanilla Dolo does not have a first-class symbols.settings group; it typically hardcodes these choices as numbers inside options: / domain: or inside algorithm calls.
In dolo-plus (Foundations-aligned), we introduce:
- `symbols.settings` as a first-class symbol group (declared and typed with decorators like `@in Z+`), and
- a corresponding settings binding component \( \mathcal{C}_{\text{settings}} \) that binds numeric values (typically after methodization has selected specific schemes).
At v0.1c we don’t need to implement full methodization semantics, but we do need to:
- declare settings symbols,
- calibrate them from a separate file,
- and make them available to downstream method/config evaluation.
This formal structure ensures that:
- calibration['parameters'][i] ↔ symbols['parameters'][i] (positional correspondence)
- Calibration cannot introduce symbols not declared in the stage
- Type constraints (@in) are enforceable because the domain is known
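The positional correspondence and the "no undeclared symbols" rule can be illustrated with a small function. This is a sketch, not Dolo's `calibration_to_vector`: the name `calibration_vector` is hypothetical, and treating a missing value as an error is an assumption (the spec does not say whether partial calibrations are allowed).

```python
from typing import Dict, List, Sequence


def calibration_vector(declared: Sequence[str], values: Dict[str, float]) -> List[float]:
    """Return values ordered like the declared parameter list.

    `declared` plays the role of symbols['parameters']; `values` is the
    mapping read from the calibration file.
    """
    undeclared = [name for name in values if name not in declared]
    if undeclared:
        raise ValueError(f"calibration binds undeclared symbol(s): {undeclared}")
    missing = [name for name in declared if name not in values]
    if missing:  # assumption: every declared parameter needs a value
        raise ValueError(f"no calibration value for: {missing}")
    return [values[name] for name in declared]
```

The key order of the calibration file is irrelevant here; only the declared order determines position `i` in the resulting vector.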
Key requirements:
- Calibration is a separate file (language-agnostic YAML). The stage file should be loadable without it.
- Calibrating attaches numbers as separate objects:
  - `model.calibration` (parameters; reuse Dolo’s existing calibration dict/list object), and
  - `model.settings` (settings; same shape, separate object).
  The syntactic stage (`model.symbols`, equations) is unchanged; any “bound value” view is a presentation/API choice, not a data-model requirement.
- Backwards compatibility: plain Dolo single-file models that already have a `calibration:` block must continue to work unchanged.
File split (v0.1c)
- Stage file (no calibration):
  - contains `symbols:` (possibly decorated), `equations:`, `dolo_plus:` metadata, etc.
- Calibration file (parameters only):
  - contains parameter bindings, preferably as `calibration.parameters: {...}`
  - (legacy synonym) a flat `calibration: {β: 0.96, ...}` is treated as `calibration.parameters`
- Settings file (numerical settings only):
  - contains `settings: {...}` (values for names declared in `symbols.settings`)
  - (optional legacy combined form) `calibration.settings: {...}` is accepted if the user wants a single file, but tooling should still split it into the separate `model.settings` attachment.
Backwards compatibility:
- If a plain Dolo model embeds a flat `calibration:` mapping in the stage YAML, we treat it as legacy input (no `settings` group required).
- In dolo-plus mode we prefer parameter-only calibration + a separate `settings:` mapping so the “parameters vs numerics” distinction stays explicit.
Schema note (v0.1c):
- Accept either:
  - `calibration: {parameters: ...}` and `settings: {...}` (preferred; may be separate files), or
  - `calibration: {β: 0.96, ...}` (legacy Dolo synonym for `calibration.parameters`), or
  - `calibration: {parameters: ..., settings: ...}` (legacy combined form; accepted but discouraged), or
  - `parameters: {...}` / `settings: {...}` at top-level (optional shorthand; treated as the bodies of the calibration/settings attachments).
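One way to normalize the accepted forms into two separate mappings is sketched below. The function name `split_bindings` and the rule used to tell the structured form from the legacy flat form (all keys are `parameters`/`settings` and all values are mappings) are assumptions; when the same name appears in more than one form, the later source simply wins in this sketch.

```python
from collections.abc import Mapping
from typing import Any, Dict, Tuple

_GROUPS = {"parameters", "settings"}


def split_bindings(doc: Mapping) -> Tuple[Dict[str, Any], Dict[str, Any]]:
    """Split an already-loaded YAML document into (parameters, settings)."""
    params: Dict[str, Any] = {}
    settings: Dict[str, Any] = {}
    calib = doc.get("calibration") or {}
    structured = (
        bool(calib)
        and set(calib) <= _GROUPS
        and all(isinstance(v, Mapping) for v in calib.values())
    )
    if structured:
        params.update(calib.get("parameters", {}))
        settings.update(calib.get("settings", {}))  # legacy combined form
    else:
        params.update(calib)  # legacy flat form: calibration == calibration.parameters
    # Top-level shorthand and the preferred separate settings mapping.
    params.update(doc.get("parameters") or {})
    settings.update(doc.get("settings") or {})
    return params, settings
```

Whatever the authoring form, downstream code then sees exactly two objects, matching the `model.calibration` / `model.settings` split.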
“Re-jig” current Dolo calibration implementation (minimal reuse strategy)
Current Dolo computes calibration from `model.data["calibration"]` via:
- `SymbolicModel.get_calibration()` (builds a dict, solves triangular system), then
- `Model.calibration` (vectorizes via `calibration_to_vector(...)`, wraps in `CalibrationDict`).
For v0.1c we want to reuse this machinery, but decouple it from “calibration must live inside the stage YAML”:
- Implement `calibrate(stage_model, calibration_yaml_or_path)` such that it:
  - loads the calibration mapping from a separate YAML file,
  - computes a calibration dict using the existing triangular-solver logic,
  - then produces a calibrated stage/model object by attaching `model.calibration` / `model.settings` without mutating the syntactic stage in-place.
Recommended pattern (v0.1c):
- A calibrated object is conceptually a pair: `(symbolic_stage, calibration_functor)` (and similarly for settings).
- Concretely, reuse existing Dolo objects:
  - `model.calibration` is the concrete parameter calibration functor \( \mathcal{C}_{\text{param}} \)
  - `model.settings` is the concrete settings binding \( \mathcal{C}_{\text{settings}} \)
  - the stage’s declarations live in `model.symbols` (names) and `model.symbols_math` (domains/defs)
- No new wrapper class is required; implementations may return an existing `Model` instance or attach these dict-like objects to the parsed stage representation.
This preserves the “functorial” intent (same stage, multiple calibrations) and avoids accidental mutation bugs.
The critical contract is that after calibration:
- dolo-plus/DDSL tooling can read values from `model.calibration` (parameters) and `model.settings` (numerics), and
- legacy Dolo code paths can still use `model.calibration` unchanged for compilation and evaluation.
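The non-mutation contract can be illustrated on a toy stage object. This sketch assumes the values are already loaded and solved (no YAML reading, no triangular solve), uses `SimpleNamespace` in place of Dolo's `Model`, and takes plain dicts where the real code would build a `CalibrationDict`.

```python
from types import SimpleNamespace


def calibrate(stage, parameters, settings=None):
    """Return a new object sharing the stage's syntax, with numbers attached.

    The input `stage` is never modified, so it can be reused for several
    calibrations.
    """
    calibrated = SimpleNamespace(**vars(stage))  # new container, shared symbols/equations
    calibrated.calibration = dict(parameters)
    calibrated.settings = dict(settings or {})
    return calibrated


stage = SimpleNamespace(symbols={"parameters": ["β"], "settings": ["tol"]})
patient = calibrate(stage, {"β": 0.96}, {"tol": 1.0e-8})
impatient = calibrate(stage, {"β": 0.90}, {"tol": 1.0e-8})
```

Both calibrated objects point at the same `symbols` object, while `stage` itself still has no `calibration` attribute after either call.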
5. Methods plumbing (horse compatibility)
0.1c does not introduce a new methodization language.
Instead:
- Any Dolo method/config blocks that the horse solver relies on (e.g. `options:`, `domain:`, recipe-specific sections) must pass through unchanged in the dolo-plus → horse translation.
- In later work, the same decorator mechanism can be used to type-check “settings” and method configuration values, but that is out of scope here.
6. Acceptance criteria (0.1c)
- [ ] Legacy import unchanged: existing vanilla Dolo example YAML files import/compile/solve exactly as before.
- [ ] Decorator metadata is non-invasive:
  - the model imports when decorated declarations are present under `symbols`,
  - the canonical bare names remain available for calibration/compilation,
  - decorator metadata is available via `model.symbols_math` (a separate declarations object), while `model.symbols` remains names-only.
- [ ] Ordering is well-defined: mapping-valued symbol groups preserve YAML insertion order when producing the legacy names-only `list[str]` view.
- [ ] Decorator parsing uses Lark: decorator applications in `symbols` are parsed via dolang’s Lark parser into structured objects (AST) and stored under `model.symbols_math` (not left as opaque strings).
- [ ] `R+` / `R++` are handled safely: primitive domains with `+` are either canonicalized to identifier-safe aliases before parsing (via the registry) or tokenized as primitives in the decorator/domain grammar (no accidental arithmetic parsing).
- [ ] User-defined spaces resolve: can declare `symbols.spaces.X: @def ClOp(a,b)` (with `a`, `b` declared) and then use `@in X` elsewhere (e.g. `states.w: @in X`) without ambiguity; unresolved names and cyclic space defs are rejected.
- [ ] Registry exists and is used: a language-agnostic registry file (e.g. `AI/prompts/dev-specs/dolo+/spec_0.1/0.1_c/ddsl_registry.yaml`) exists listing decorator tokens and primitive objects (incl. `R+`, `ClOp`), and the dolo-plus importer + decorator-parser consult it for validation/resolution.
- [ ] Calibration is functorial + split-file: can load a stage file without `calibration:` and then apply `calibrate(stage, calibration_file)` to produce a calibrated object where numeric values live in `model.calibration` (parameters) and `model.settings` (numerics), while preserving the legacy `CalibrationDict` interface for old Dolo compilation.
- [ ] Calibration does not mutate the symbolic stage: `calibrate(...)` returns a fresh object (e.g. a new `Model` or a copied stage representation) so the same syntactic stage can be reused across multiple calibrations, without requiring a new `CalibratedStage` class.
- [ ] Settings are first-class: can declare `symbols.settings` (typed with `@in`) and bind values via `settings: {...}` (or legacy `calibration.settings`); settings values are available via `model.settings` to method/config evaluation (grid sizes, tolerances, etc.).
- [ ] Calibration validation (minimal): if a parameter has an `@in` decorator in a supported domain form, out-of-domain calibrated values raise a clear error.
- [ ] Horse path remains runnable: adding parameter decorators (inside `symbols`) to a stage-mode dolo-plus model does not break 0.1b translation or 0.1d solver execution (decorators are metadata only).
7. Implementation Plan: Registry Integration (Phase 1.1 fork)
Design principle (per §3.1): No new classes. Use sidecar dicts + Lark ASTs:
- model.symbols stays names-only (list[str])
- model.symbols_math stores Lark ASTs keyed by bare name
- model.calibration / model.settings use existing Dolo machinery
Target branches: phase1.1_0.1 in both packages/dolo and packages/dolang
Guiding principle: Minimal, backwards-compatible changes. Legacy Dolo must continue to work unchanged.
7.1 Overview
This implementation adds the DDSL registry as a loadable YAML file and provides Python utilities to:
1. Load and validate the registry at import time
2. Provide lookup functions for decorators, primitives, constructors, and aliases
3. Expose identifier-safe canonicalization for parser integration
No changes to existing Dolo/Dolang behavior — this is purely additive infrastructure.
7.2 File Changes
A. New files to create
| Package | File | Purpose |
|---|---|---|
| dolang | `dolang/ddsl_registry.yaml` | Registry data (copy from devspec, deduplicated) |
| dolang | `dolang/registry.py` | Registry loader + lookup API |
B. Files to modify (minimal touch)
| Package | File | Change |
|---|---|---|
| dolang | `dolang/__init__.py` | Export registry module (optional, for convenience) |
C. Files unchanged
- `dolang/grammar.lark`: no grammar changes yet (decorator parsing is phase 2)
- `dolang/grammar.py`: no changes
- `dolo/compiler/model.py`: no changes (registry is infrastructure only)
- `dolo/compiler/misc.py`: no changes
7.3 New file: dolang/ddsl_registry.yaml
Copy and deduplicate from AI/prompts/dev-specs/dolo+/spec_0.1/0.1_c/ddsl_registry.yaml:
```yaml
version: "0.1"

# Language-agnostic registry of DDSL primitives and decorator tokens.
# v0.1 goal: identity + arity + basic structural validation (not full semantics).
decorators:
  "@in":
    kind: membership
    arity: 1
    arg_kinds: ["domain_expr"]
    description: "Set membership / typing constraint"
  "@def":
    kind: definition
    arity: 1
    arg_kinds: ["domain_expr"]
    description: "Define a named space (domain)"

primitive_objects:
  domains:
    "R":
      description: "Real numbers"
    "R+":
      description: "Nonnegative reals"
    "R++":
      description: "Positive reals"
    "Z":
      description: "Integers"
    "Z+":
      description: "Nonnegative integers"

constructors:
  "ClOp":
    kind: domain_constructor
    arity: 2
    arg_kinds: ["scalar", "scalar"]
    description: "Closed-open interval [a,b)"

# Aliases: alias -> canonical
aliases:
  "R_plus": "R+"
  "R_plusplus": "R++"
  "Z_plus": "Z+"

# Identifier-safe forms: canonical -> identifier-safe token
identifier_safe:
  "R+": "R_plus"
  "R++": "R_plusplus"
  "Z+": "Z_plus"
```
7.4 New file: dolang/registry.py¶
"""
DDSL Registry Loader
Provides access to DDSL primitives (decorators, domains, constructors, aliases).
This module loads ddsl_registry.yaml once at import time and exposes lookup functions.
Usage:
from dolang.registry import (
get_decorator,
is_primitive_domain,
get_constructor,
canonicalize,
to_identifier_safe,
)
"""
from pathlib import Path
from typing import Any, Dict, Optional
import yaml
__all__ = [
"REGISTRY",
"get_decorator",
"is_decorator",
"is_primitive_domain",
"get_primitive_domain",
"get_constructor",
"canonicalize",
"to_identifier_safe",
]
# ---------------------------------------------------------------------------
# Load registry once at import time
# ---------------------------------------------------------------------------
_REGISTRY_PATH = Path(__file__).parent / "ddsl_registry.yaml"
def _load_registry() -> Dict[str, Any]:
"""Load the DDSL registry from YAML."""
if not _REGISTRY_PATH.exists():
raise FileNotFoundError(f"DDSL registry not found: {_REGISTRY_PATH}")
with open(_REGISTRY_PATH, "r", encoding="utf-8") as f:
return yaml.safe_load(f)
REGISTRY: Dict[str, Any] = _load_registry()
# ---------------------------------------------------------------------------
# Decorator lookups
# ---------------------------------------------------------------------------
def get_decorator(name: str) -> Optional[Dict[str, Any]]:
    """Get decorator spec by name (e.g., '@in'). Returns None if not found."""
    return REGISTRY.get("decorators", {}).get(name)


def is_decorator(name: str) -> bool:
    """Check if name is a registered decorator."""
    return name in REGISTRY.get("decorators", {})


# ---------------------------------------------------------------------------
# Primitive domain lookups
# ---------------------------------------------------------------------------
def is_primitive_domain(name: str) -> bool:
    """Check if name is a primitive domain (e.g., 'R+')."""
    canonical = canonicalize(name)
    return canonical in REGISTRY.get("primitive_objects", {}).get("domains", {})


def get_primitive_domain(name: str) -> Optional[Dict[str, Any]]:
    """Get primitive domain spec by name. Returns None if not found."""
    canonical = canonicalize(name)
    return REGISTRY.get("primitive_objects", {}).get("domains", {}).get(canonical)


# ---------------------------------------------------------------------------
# Constructor lookups
# ---------------------------------------------------------------------------
def get_constructor(name: str) -> Optional[Dict[str, Any]]:
    """Get constructor spec by name (e.g., 'ClOp'). Returns None if not found."""
    return REGISTRY.get("constructors", {}).get(name)


# ---------------------------------------------------------------------------
# Alias resolution
# ---------------------------------------------------------------------------
def canonicalize(name: str) -> str:
    """Resolve alias to canonical form. Returns name unchanged if not an alias."""
    aliases = REGISTRY.get("aliases", {})
    return aliases.get(name, name)


def to_identifier_safe(name: str) -> str:
    """Convert canonical form to identifier-safe token (e.g., 'R+' -> 'R_plus')."""
    id_safe = REGISTRY.get("identifier_safe", {})
    return id_safe.get(name, name)


def from_identifier_safe(name: str) -> str:
    """Convert identifier-safe token back to canonical form."""
    # Reverse lookup
    id_safe = REGISTRY.get("identifier_safe", {})
    reverse = {v: k for k, v in id_safe.items()}
    return reverse.get(name, name)
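The reverse lookup in from_identifier_safe() only round-trips if every identifier-safe token belongs to exactly one canonical name; when two canonical names share a token, the dict comprehension silently keeps the last one. A minimal sketch of a consistency check follows. The helper name find_identifier_safe_clashes and the registry fragments are hypothetical and not part of the spec.

```python
from typing import Any, Dict, List


def find_identifier_safe_clashes(registry: Dict[str, Any]) -> List[str]:
    """Return identifier-safe tokens claimed by more than one canonical name."""
    seen: Dict[str, str] = {}
    clashes: List[str] = []
    for canonical, safe in registry.get("identifier_safe", {}).items():
        if safe in seen and safe not in clashes:
            clashes.append(safe)
        seen.setdefault(safe, canonical)
    return clashes


# Hypothetical registry fragments, for illustration only.
good = {"identifier_safe": {"R+": "R_plus", "R++": "R_plusplus", "Z+": "Z_plus"}}
bad = {"identifier_safe": {"R+": "R_plus", "R>0": "R_plus"}}

print(find_identifier_safe_clashes(good))  # []
print(find_identifier_safe_clashes(bad))   # ['R_plus']
```

A check like this could run once at load time or in the test suite, so that a clash shows up as a registry error rather than as a wrong canonical name downstream.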
7.5 Modify: dolang/__init__.py¶
Add optional export (minimal change):
# Existing imports ...

# DDSL registry (v0.1c)
try:
    from . import registry
except ImportError:
    registry = None  # Graceful fallback if registry not yet installed
7.6 Test file: dolang/tests/test_registry.py¶
"""
Tests for DDSL registry loader.
"""
import pytest

from dolang.registry import (
    REGISTRY,
    get_decorator,
    is_decorator,
    is_primitive_domain,
    get_primitive_domain,
    get_constructor,
    canonicalize,
    to_identifier_safe,
    from_identifier_safe,
)


class TestRegistryLoad:
    def test_registry_loads(self):
        assert REGISTRY is not None
        assert "version" in REGISTRY
        assert REGISTRY["version"] == "0.1"

    def test_decorators_exist(self):
        assert "decorators" in REGISTRY
        assert "@in" in REGISTRY["decorators"]
        assert "@def" in REGISTRY["decorators"]


class TestDecoratorLookup:
    def test_get_decorator_in(self):
        dec = get_decorator("@in")
        assert dec is not None
        assert dec["kind"] == "membership"
        assert dec["arity"] == 1

    def test_get_decorator_def(self):
        dec = get_decorator("@def")
        assert dec is not None
        assert dec["kind"] == "definition"

    def test_get_decorator_unknown(self):
        assert get_decorator("@unknown") is None

    def test_is_decorator(self):
        assert is_decorator("@in") is True
        assert is_decorator("@def") is True
        assert is_decorator("@foo") is False


class TestPrimitiveDomains:
    def test_is_primitive_domain(self):
        assert is_primitive_domain("R") is True
        assert is_primitive_domain("R+") is True
        assert is_primitive_domain("R++") is True
        assert is_primitive_domain("Z") is True
        assert is_primitive_domain("Z+") is True
        assert is_primitive_domain("Foo") is False

    def test_is_primitive_via_alias(self):
        # R_plus is an alias for R+
        assert is_primitive_domain("R_plus") is True

    def test_get_primitive_domain(self):
        dom = get_primitive_domain("R+")
        assert dom is not None
        assert "description" in dom


class TestConstructors:
    def test_get_constructor_clop(self):
        con = get_constructor("ClOp")
        assert con is not None
        assert con["arity"] == 2
        assert con["kind"] == "domain_constructor"

    def test_get_constructor_unknown(self):
        assert get_constructor("Unknown") is None


class TestAliases:
    def test_canonicalize_alias(self):
        assert canonicalize("R_plus") == "R+"
        assert canonicalize("R_plusplus") == "R++"
        assert canonicalize("Z_plus") == "Z+"

    def test_canonicalize_already_canonical(self):
        assert canonicalize("R+") == "R+"
        assert canonicalize("R") == "R"

    def test_canonicalize_unknown(self):
        assert canonicalize("Foo") == "Foo"


class TestIdentifierSafe:
    def test_to_identifier_safe(self):
        assert to_identifier_safe("R+") == "R_plus"
        assert to_identifier_safe("R++") == "R_plusplus"
        assert to_identifier_safe("Z+") == "Z_plus"

    def test_to_identifier_safe_already_safe(self):
        assert to_identifier_safe("R") == "R"

    def test_from_identifier_safe(self):
        assert from_identifier_safe("R_plus") == "R+"
        assert from_identifier_safe("R_plusplus") == "R++"

    def test_roundtrip(self):
        for canonical in ["R+", "R++", "Z+"]:
            safe = to_identifier_safe(canonical)
            assert from_identifier_safe(safe) == canonical
7.7 Commit sequence¶
Commit 1: Add DDSL registry infrastructure to dolang
feat(dolang): add DDSL registry loader (v0.1c)
- Add ddsl_registry.yaml with decorators, primitives, constructors, aliases
- Add registry.py with lookup functions (get_decorator, is_primitive_domain, etc.)
- Add test_registry.py with unit tests
- Export registry module from __init__.py
This is infrastructure-only; no changes to grammar or existing behavior.
Existing Dolo models continue to work unchanged.
Ref: spec_0.1c §4.6 (language-agnostic registry)
Files changed:
- dolang/ddsl_registry.yaml (new)
- dolang/registry.py (new)
- dolang/tests/test_registry.py (new)
- dolang/__init__.py (minimal edit)
7.8 Verification checklist¶
Before committing:
- [ ] pytest packages/dolang/dolang/tests/test_registry.py passes
- [ ] pytest packages/dolang/dolang/tests/ (all existing tests) still pass
- [ ] pytest packages/dolo/dolo/tests/ (all existing tests) still pass
- [ ] Import from dolang.registry import REGISTRY works
- [ ] Legacy Dolo example models still load and solve (run a smoke test)
7.9 Future phases (out of scope for this commit)¶
| Phase | Scope |
|---|---|
| 1.1b | Use registry in decorator parsing (grammar extension) |
| 1.1c | Integrate registry lookups into model.symbols processing |
| 1.2 | Full decorator AST + validation pipeline |
| 1.3 | Calibration functor implementation |
7.10 Phase 2: Symbol object changes in dolo (model.py)¶
The registry alone is not enough — we need to modify dolo/compiler/model.py to actually parse and store decorated symbols.
Design principle (per §3.1): No new classes. Use plain dicts + Lark ASTs as sidecar attachments.
A. New file: dolang/decorator_parser.py¶
Parser for decorator applications (uses Lark). Returns Lark parse trees directly — no custom classes.
"""
Decorator Application Parser

Parses decorator strings like "@in (0,1)" or "@def R+" into Lark ASTs.
"""
from typing import Optional

from lark import Lark, Tree
from lark.exceptions import LarkError

from dolang.registry import canonicalize

# Minimal grammar for decorator applications
DECORATOR_GRAMMAR = r'''
    start: decorator_app
    decorator_app: DECORATOR domain_expr
    domain_expr: interval
               | primitive_domain
               | constructor_call
               | CNAME
    interval: "(" number "," number ")" -> open_interval
            | "[" number "," number "]" -> closed_interval
            | "(" number "," number "]" -> left_open_interval
            | "[" number "," number ")" -> right_open_interval
    constructor_call: CNAME "(" arg_list ")"
    arg_list: (CNAME | number) ("," (CNAME | number))*
    primitive_domain: PRIMITIVE_DOMAIN
    // "R" and "Z" also match CNAME. The priority makes the primitive win, and the
    // lookahead keeps identifiers such as "R_plus" or "Rate" lexing as CNAME.
    PRIMITIVE_DOMAIN.2: /[RZ]\+*(?![A-Za-z0-9_])/
    number: SIGNED_NUMBER
    DECORATOR: /@[a-z]+/
    CNAME: /[A-Za-z_][A-Za-z0-9_]*/
    %import common.SIGNED_NUMBER
    %import common.WS
    %ignore WS
'''

_parser = None


def get_parser():
    """Build the LALR parser on first use and cache it."""
    global _parser
    if _parser is None:
        _parser = Lark(DECORATOR_GRAMMAR, start='start', parser='lalr')
    return _parser


def parse_decorator(raw: str) -> Optional[Tree]:
    """
    Parse a decorator application string into a Lark Tree.

    Args:
        raw: e.g., "@in (0,1)" or "@def R+"

    Returns:
        Lark Tree object, or None if parse fails.

    Example:
        >>> tree = parse_decorator("@in (0,1)")
        >>> tree.data
        'start'
    """
    try:
        return get_parser().parse(raw.strip())
    except LarkError:
        # Lexing and parsing failures only; other exceptions propagate
        return None
B. Modify: dolo/compiler/model.py¶
Attach symbols_math as a plain dict of Lark ASTs. No new classes — just a sidecar dict.
# In SymbolicModel class, add after building self.__symbols__:
from dolang.decorator_parser import parse_decorator

# Build symbols_math sidecar (dolo-plus mode)
self.__symbols_math__ = {}  # group -> {name -> Lark Tree}
for sg, seq in mapping_items(symbols_node):
    if isinstance(seq, yaml.nodes.MappingNode):
        self.__symbols_math__[sg] = {}
        for name_node, decor_node in seq.value:
            name = name_node.value
            # Only scalar nodes carry decorator text; an empty scalar has value ""
            is_scalar = isinstance(decor_node, yaml.nodes.ScalarNode)
            raw_decor = decor_node.value.strip() if is_scalar else ""
            if raw_decor.startswith("@"):
                tree = parse_decorator(raw_decor)
                if tree is not None:
                    self.__symbols_math__[sg][name] = tree

# Expose via property
@property
def symbols_math(self):
    """Decorated symbol metadata (Lark ASTs). Empty dict for legacy models."""
    if not hasattr(self, '__symbols_math__'):
        return {}
    return self.__symbols_math__
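The invariant behind this sidecar is that decorator text never leaks into a symbol's name. As a minimal sketch of that separation, the function below works on already-loaded YAML data (plain lists and dicts) rather than on YAML nodes. The name split_symbols and the sample declarations are hypothetical.

```python
from typing import Dict, List, Optional, Tuple, Union

# A group is either a legacy list of names or a mapping name -> decorator text
SymbolGroup = Union[List[str], Dict[str, Optional[str]]]


def split_symbols(
    symbols: Dict[str, SymbolGroup],
) -> Tuple[Dict[str, List[str]], Dict[str, Dict[str, str]]]:
    """Split declarations into bare names and a decorator-text sidecar."""
    bare: Dict[str, List[str]] = {}
    decorators: Dict[str, Dict[str, str]] = {}
    for group, decl in symbols.items():
        if isinstance(decl, dict):
            # dolo-plus mapping form: keys are the canonical bare names
            bare[group] = list(decl.keys())
            decorators[group] = {
                name: text.strip()
                for name, text in decl.items()
                if isinstance(text, str) and text.strip().startswith("@")
            }
        else:
            # Legacy list form: names only, no sidecar entry
            bare[group] = list(decl)
    return bare, decorators


symbols = {
    "states": ["k"],
    "parameters": {"β": "@in (0,1)", "δ": None},
}
bare, decorators = split_symbols(symbols)
print(bare)        # {'states': ['k'], 'parameters': ['β', 'δ']}
print(decorators)  # {'parameters': {'β': '@in (0,1)'}}
```

Legacy code paths would only ever see the first return value, so a parameter can carry "@in (0,1)" without changing the name used for calibration lookup or equation parsing.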
C. Files to modify (summary)¶
| Package | File | Change |
|---|---|---|
| dolang | dolang/decorator_parser.py | NEW: Lark parser, returns Tree directly |
| dolo | dolo/compiler/model.py | MODIFY: attach symbols_math dict of Lark ASTs |
No symbols.py file needed. No new classes.
D. Commit¶
Commit 2: Add decorator parser + symbols_math attachment
feat(dolang,dolo): add decorator parsing and symbols_math sidecar (v0.1c)
- Add decorator_parser.py with Lark grammar for @in/@def syntax
- Modify model.py to attach symbols_math dict (group -> {name -> Lark Tree})
- Legacy list format continues to work unchanged
- No new classes — just plain dicts + Lark ASTs
Backwards-compatible: existing models work unchanged.
Ref: spec_0.1c §3.1, §4.3
7.11 Phase 3: Calibration Functor (thin wrapper)¶
Per §4.7, calibration is a separate functor. We use a thin wrapper around existing Dolo machinery:
- No modifications to SymbolicModel or Model classes
- A simple calibrate() function that loads calibration from a separate file
- Reuses existing Dolo CalibrationDict and triangular solver
Design principle: Don't reinvent — just wrap existing machinery.
A. Simple calibrate() function¶
Add to existing dolo/compiler/model.py or a small dolo/compiler/calibration.py:
"""
Calibration Functor (Thin Wrapper)

Loads calibration from separate YAML and builds existing CalibrationDict.
"""
from pathlib import Path
from typing import Union

import yaml


def load_calibration(calibration_source: Union[str, Path, dict]) -> dict:
    """
    Load calibration data from file or dict.

    Accepts:
    - Path to YAML file
    - Dict with flat format: {β: 0.96, r: 0.03}
    - Dict with split format: {parameters: {...}, settings: {...}}

    Returns:
        Flat dict of calibration values.
    """
    # Load from file if path
    if isinstance(calibration_source, (str, Path)):
        path = Path(calibration_source)
        if not path.exists():
            raise FileNotFoundError(f"Calibration file not found: {path}")
        with open(path, 'r', encoding='utf-8') as f:
            data = yaml.safe_load(f)
    else:
        data = calibration_source

    # Unwrap 'calibration:' key if present
    if isinstance(data, dict) and 'calibration' in data:
        data = data['calibration']

    # An empty file or an empty 'calibration:' block loads as None
    if data is None:
        return {}

    # Normalize split format to flat
    if isinstance(data, dict) and ('parameters' in data or 'settings' in data):
        flat = {}
        # "or {}" guards against a group that is present but empty (YAML null)
        flat.update(data.get('parameters') or {})
        flat.update(data.get('settings') or {})
        return flat
    return data
def calibrate(stage, calibration_source: Union[str, Path, dict]):
    """
    Apply calibration to a stage using existing Dolo machinery.

    Args:
        stage: SymbolicModel (with or without inline calibration)
        calibration_source: Path to calibration YAML, or dict

    Returns:
        CalibrationDict (existing Dolo object)

    Example:
        >>> from dolo import yaml_import
        >>> from dolo.compiler.calibration import calibrate
        >>> stage = yaml_import("stage.yaml")
        >>> calib = calibrate(stage, "calibration.yaml")
        >>> calib['β']
        0.96
    """
    from .misc import CalibrationDict, calibration_to_vector

    # Load calibration data
    calib_data = load_calibration(calibration_source)

    # Use existing Dolo machinery to build CalibrationDict
    # (reuses triangular solver, vectorization, etc.)
    symbols = stage.symbols
    calib_vector = calibration_to_vector(symbols, calib_data)
    return CalibrationDict(symbols, calib_vector)
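One consequence of flattening the split format with successive dict.update() calls is that a name declared under both parameters and settings silently takes the settings value. A minimal sketch of a pre-merge check follows. The helper name split_format_collisions and the sample values are hypothetical, and whether such a collision should be an error is left open here.

```python
from typing import Any, Dict, Set


def split_format_collisions(data: Dict[str, Any]) -> Set[str]:
    """Return names that appear under both 'parameters' and 'settings'."""
    parameters = data.get("parameters") or {}
    settings = data.get("settings") or {}
    return set(parameters) & set(settings)


# Hypothetical split-format calibration, for illustration only.
split = {
    "parameters": {"β": 0.96, "r": 0.03},
    "settings": {"r": 0.05, "n_grid": 100},
}
print(split_format_collisions(split))  # {'r'}
```

Running a check like this before the merge would let a loader warn or raise instead of quietly calibrating r to 0.05.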
B. Optional: validation against @in constraints¶
If symbols_math has decorator ASTs, we can optionally validate:
def validate_calibration(stage, calib_dict):
    """
    Validate calibration values against @in constraints in symbols_math.

    Raises ValueError if any value is out of domain.
    """
    symbols_math = getattr(stage, 'symbols_math', {})
    # CalibrationDict is keyed by group; per-symbol values live in its .flat view.
    # A plain flat dict is used as-is.
    values = getattr(calib_dict, 'flat', calib_dict)
    for group in ['parameters', 'settings']:
        if group not in symbols_math:
            continue
        for name, tree in symbols_math[group].items():
            if name not in values:
                continue
            value = values[name]
            # Extract domain from Lark tree and validate
            # (implementation details depend on tree structure;
            # _check_domain_from_tree is a placeholder, not yet defined)
            _check_domain_from_tree(name, value, tree)
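The domain check itself is left open above. As a minimal sketch of what the interval case might look like, the code below dispatches on the four interval aliases from the decorator grammar. To stay self-contained it uses a stand-in Node class instead of lark.Tree and assumes the two bounds have already been extracted as the node's children; the names Node and check_interval are hypothetical.

```python
import operator
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Stand-in for lark.Tree: a rule name plus children (assumption)."""
    data: str
    children: List[object] = field(default_factory=list)


# Rule alias -> (lower-bound comparison, upper-bound comparison)
_BOUND_CHECKS = {
    "open_interval": (operator.lt, operator.lt),
    "closed_interval": (operator.le, operator.le),
    "left_open_interval": (operator.lt, operator.le),
    "right_open_interval": (operator.le, operator.lt),
}


def check_interval(name: str, value: float, node: Node) -> None:
    """Raise ValueError if value lies outside the interval encoded by node."""
    checks = _BOUND_CHECKS.get(node.data)
    if checks is None:
        return  # Not an interval node: this sketch validates nothing else
    lo, hi = (float(child) for child in node.children)
    lo_ok, hi_ok = checks
    if not (lo_ok(lo, value) and hi_ok(value, hi)):
        raise ValueError(
            f"{name} = {value} violates {node.data} with bounds ({lo}, {hi})"
        )


beta_domain = Node("open_interval", ["0", "1"])
check_interval("β", 0.96, beta_domain)  # inside (0, 1): passes silently
try:
    check_interval("β", 1.0, beta_domain)  # endpoint of an open interval
except ValueError as err:
    print(err)
```

In the real parse tree the bounds sit inside number subtrees under decorator_app and domain_expr, so an actual implementation would first walk down to the interval node. Primitive domains and constructors such as ClOp would need their own cases.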
C. Files to modify (summary)¶
| Package | File | Change |
|---|---|---|
| dolo | dolo/compiler/calibration.py | NEW: thin load_calibration() + calibrate() wrapper |
No modifications to existing classes. Just a thin wrapper that calls existing CalibrationDict machinery.
D. Commit¶
Commit 3: Add calibrate() wrapper
feat(dolo): add calibrate() function for separate calibration files (v0.1c)
- Add load_calibration() to load from file or dict
- Add calibrate() that wraps existing CalibrationDict machinery
- Supports flat and split calibration formats
- No new classes — reuses existing Dolo infrastructure
Ref: spec_0.1c §4.7
7.12 Complete commit sequence (simplified)¶
| # | Package | Commit | Files |
|---|---|---|---|
| 1 | dolang | Registry infrastructure | ddsl_registry.yaml, registry.py, test_registry.py |
| 2 | dolang + dolo | Decorator parser + symbols_math | decorator_parser.py, modify model.py |
| 3 | dolo | Calibration wrapper | calibration.py (thin wrapper) |
Total: 3 commits, no new classes.
7.13 Notes¶
- Registry location: dolang/ddsl_registry.yaml (runtime copy, authoritative)
- Backwards compatibility: all existing tests must pass; legacy formats unchanged
- No new classes: use plain dicts + Lark ASTs throughout
- symbols_math: sidecar dict attached to model, keyed by bare name → Lark Tree
- Calibration: thin wrapper around existing CalibrationDict machinery
- Validation: @in constraints can be checked by walking Lark tree (optional)