RFD-0042 — Macros and compile-time code generation

Discussion Opened 2026-05-17

Question

How does Argon expose user-extensible compile-time code generation? This RFD specifies the macro system named as atom #6 in RFD-0040. State machines, derive-style boilerplate, and parameterized rule scaffolds all flow through it.

Context

Today’s surface bakes higher-order patterns into the compiler. State machines have a dedicated lifecycle { … } syntax that desugars in compiler-owned code to phase concepts + transition derives + reachability constraints. Patterns (RFD-0019) have their own first-class declaration form. Living diagrams (RFD-0026) sit beside both.

The compiler-owned approach has two costs:

  • Every higher-order pattern requires compiler changes.
  • The set of available patterns is closed.

The first cost has been paid repeatedly (lifecycle/statemachine, patterns, diagrams). The second cost shows up in the sharpe-ontology/ branches: the team reaches for behavioral contracts on type fields, for declarative ordering lattices in metaxis declarations, for change-modeling templates — each of which would be a new keyword today.

Macros invert this. The compiler ships the six substrate atoms (RFD-0040); package authors write macros that compose them into higher-order constructs. State machines move to a stdlib macro. Pattern scaffolds become macros. @[derive(Eq, Hashable)] becomes a macro.

The Rust analogy is direct. Rust’s #[derive(...)] and macro_rules! cover the common cases; procedural macros handle the rest. The two-flavor design (declarative + procedural) is well-trodden. Argon adopts it, with one significant addition: procedural macros see the resolved AST, including metatype-calculus annotations, not just token streams.

Decision

D1 — Two flavors

Argon ships two macro forms.

Declarative macros (macro!-style):

pub macro vec {
    () => { Vec::empty() };
    ($head:expr $(, $tail:expr)* $(,)?) => {
        Vec::cons($head, vec!($($tail),*))
    };
}

let xs = vec!(1, 2, 3, 4);

Pattern-matching on token sequences. No type information; no metatype-calculus access. Used for syntactic sugar.
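For comparison, the same recursive shape can be written with Rust's macro_rules!. This is a Rust analog and not Argon syntax: the names list! and demo_list are chosen for illustration, and the sketch builds a standard Vec instead of calling the Vec::cons and Vec::empty constructors shown above.

```rust
// Rust analog of the recursive declarative macro (illustrative only).
macro_rules! list {
    // Base case: no elements left.
    () => { Vec::new() };
    // Recursive case: one head, then zero or more comma-separated tail items,
    // with an optional trailing comma.
    ($head:expr $(, $tail:expr)* $(,)?) => {{
        let mut rest = list!($($tail),*);
        rest.insert(0, $head);
        rest
    }};
}

/// Expands through four recursive steps and one base case.
pub fn demo_list() -> Vec<i32> {
    list!(1, 2, 3, 4)
}
```

Note that the recursive arm has to accept a single element with no following comma; otherwise the final step (one remaining element) matches no arm.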

Procedural macros (#[argon_macro]-style):

// In a Rust crate inside the package.
#[argon_macro]
pub fn expand(input: ResolvedAst) -> Result<TokenStream, MacroError> {
    // Bind the name first: quote! interpolates `#identifier`, not `#expr.field`.
    let name = &input.name;
    let fields = input.fields();
    // One comparison per field, joined with `&&`.
    let body = fields.iter().map(|f| quote! {
        a.#f == b.#f
    }).reduce(|x, y| quote! { #x && #y });
    Ok(quote! {
        impl Eq for #name {
            fn equal(a: Self, b: Self) -> Bool = #body
        }
    })
}

Compiled Rust code. Receives a resolved AST input: type assignments, metatype annotations, decorator annotations, tier classifications. Emits Argon AST via a TokenStream-equivalent that re-elaborates through the normal pipeline. The canonical signature (ResolvedAst in, Result<TokenStream, MacroError> out, #[argon_macro] attribute) is spelled out in D4.
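To make the shape of the generated body concrete, here is a standalone model of that expansion that builds source text instead of a real TokenStream. It is a sketch under stated assumptions: eq_body and derive_eq are hypothetical helpers, not part of oxc-macro-api, and returning None for a declaration with no fields is one possible policy, not something this RFD specifies.

```rust
// Standalone model of a derive(Eq)-style expansion (std only, hypothetical names).

/// Join per-field comparisons with `&&`, mirroring a map + reduce over fields.
/// `reduce` yields None for an empty field list, so the caller must decide
/// what an empty declaration should expand to.
pub fn eq_body(fields: &[&str]) -> Option<String> {
    fields
        .iter()
        .map(|f| format!("a.{f} == b.{f}"))
        .reduce(|x, y| format!("{x} && {y}"))
}

/// Render the emitted impl as text, or None when there is nothing to compare.
pub fn derive_eq(name: &str, fields: &[&str]) -> Option<String> {
    let body = eq_body(fields)?;
    Some(format!(
        "impl Eq for {name} {{\n    fn equal(a: Self, b: Self) -> Bool = {body}\n}}"
    ))
}
```

The empty-field case is worth handling explicitly in a real macro, since an absent body would otherwise produce an incomplete fn definition.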

D2 — Invocation

Declarative macros invoke with !:

let xs = vec!(1, 2, 3);
println!("hello {}", name);

Procedural macros invoke as decorators (single block, before declaration, per RFD-0040 D7):

@[derive(Eq, Hashable, Json)]
pub kind Person {
    name: Text,
    email: Text,
}

The two surfaces are distinct because the cost is different: declarative-macro expansion is purely syntactic and cheap; procedural-macro expansion runs Rust code at compile time and is opt-in. The ! syntax flags an expression-level expansion; the @[…] syntax flags a declaration-level transformation.

Procedural macros can also be invoked at expression position with @[…]! syntax for inline cases:

let report = @[generate_audit_summary(period="Q3")]!;

This is the same procedural-macro mechanism; only the invocation position differs.

D3 — Expansion phase

Macros expand at oxc elaboration time. The pipeline iterates: an initial pass converges the metatype calculus on user-written declarations, macros then expand with full metatype-calculus visibility, and emitted declarations re-enter the pipeline for their own name-resolution + metatype-calculus pass. The phase order:

                ┌───────────────────────────────────────────┐
                │  parse → name resolution → metatype       │
                │  calculus (converge on user-written)      │
                └──────────────────┬────────────────────────┘
                                   │
                                   ▼
                       ┌───────────────────────┐
                       │  MACRO EXPANSION      │  ← procedural macros
                       │  (resolved AST in,    │    see resolved AST
                       │   AST tokens out)     │    with metatype info
                       └──────────┬────────────┘
                                  │
                  ┌───────────────┴─────────────┐
                  │ any emitted declarations?   │
                  └───┬──────────────────────┬──┘
                      │ yes                  │ no
                      ▼                      ▼
            re-enter name resolution +    type check → lowering
            metatype calculus, then
            re-run macro expansion
            (up to recursion limit)

Declarative macros expand purely syntactically (no resolved AST input; output is raw tokens). They can therefore run in the initial pass, before the metatype calculus converges, provided no procedural macro is attached to the surrounding declaration. Procedural macros always run after the metatype calculus.

Procedural macros receive the resolved-AST input for the declaration they’re attached to and the surrounding scope (visible declarations, in-package metatypes with computed axes, in-package trait impls, recognized shapes). They emit AST that re-enters the pipeline at the name-resolution phase — emitted declarations go through name resolution, metatype calculus, and subsequent macro expansion.

The expansion is hermetic within the per-package boundary established by RFD-0034. Cross-package macros work because the consuming package’s ox compose walks the dependency graph and runs each package’s macros at its oxc step. Termination is guaranteed by the macro recursion limit (D8) and by the requirement that emitted AST not re-attach the same procedural macro to the same declaration.
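The re-entry loop described in this section can be sketched as a simple worklist. This is a toy model, not the oxc implementation: Decl, Expander, expand_all, and the demo expanders are hypothetical, name resolution and the metatype calculus are omitted, and counting whole rounds against the limit is an assumption about how the recursion limit is measured.

```rust
use std::collections::HashMap;

/// A declaration plus the procedural macros attached to it (hypothetical model).
#[derive(Clone, Debug, PartialEq)]
pub struct Decl {
    pub name: String,
    pub macros: Vec<String>,
}

#[derive(Debug, PartialEq)]
pub enum ExpandError {
    NotFound(String),                         // cf. OE0870
    RecursionLimit,                           // cf. OE0871
    Reattached { decl: String, mac: String }, // same macro back on the same declaration
}

/// An expander maps one declaration to the declarations it emits.
pub type Expander = fn(&Decl) -> Vec<Decl>;

pub fn expand_all(
    user_decls: Vec<Decl>,
    registry: &HashMap<String, Expander>,
    limit: usize,
) -> Result<Vec<Decl>, ExpandError> {
    let mut done = Vec::new();
    let mut pending = user_decls; // declarations that still need an expansion pass
    let mut rounds = 0;
    while !pending.is_empty() {
        if rounds == limit {
            return Err(ExpandError::RecursionLimit);
        }
        rounds += 1;
        let mut emitted = Vec::new();
        for decl in &pending {
            for mac in &decl.macros {
                let expand = registry
                    .get(mac)
                    .ok_or_else(|| ExpandError::NotFound(mac.clone()))?;
                for out in expand(decl) {
                    if out.name == decl.name && out.macros.contains(mac) {
                        return Err(ExpandError::Reattached {
                            decl: decl.name.clone(),
                            mac: mac.clone(),
                        });
                    }
                    emitted.push(out);
                }
            }
        }
        done.append(&mut pending); // this round's declarations are settled
        pending = emitted;         // emitted declarations re-enter the pipeline
    }
    Ok(done)
}

// Toy expanders: one terminates, one emits forever, one re-attaches itself.
fn emit_phases(d: &Decl) -> Vec<Decl> {
    ["Pending", "Active"]
        .iter()
        .map(|p| Decl { name: format!("{}::{}", d.name, p), macros: vec![] })
        .collect()
}

fn emit_forever(d: &Decl) -> Vec<Decl> {
    vec![Decl { name: format!("{}_next", d.name), macros: vec!["forever".to_string()] }]
}

fn emit_self(d: &Decl) -> Vec<Decl> {
    vec![d.clone()]
}

pub fn demo_registry() -> HashMap<String, Expander> {
    let mut reg: HashMap<String, Expander> = HashMap::new();
    reg.insert("statemachine".to_string(), emit_phases as Expander);
    reg.insert("forever".to_string(), emit_forever as Expander);
    reg.insert("again".to_string(), emit_self as Expander);
    reg
}
```

The loop ends either when a round emits nothing or when the limit is hit, which is why both the limit and the re-attachment check are needed for termination.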

D4 — Procedural-macro API surface

Procedural macros are Rust crates inside the Argon package, marked in ox.toml:

[macros]
derive_eq = { path = "macros/derive_eq", kind = "procedural" }
statemachine = { path = "macros/statemachine", kind = "procedural" }

The macro crate depends on oxc-macro-api:

use oxc_macro_api::{ResolvedAst, TokenStream, MacroError, MetatypeRef, ConceptRef};

#[argon_macro]
pub fn expand(input: ResolvedAst) -> Result<TokenStream, MacroError> {
    // Inspect input.metatype(), input.fields(), input.decorators()
    // Build output via the quote! macro
    Ok(quote! { … })
}

The ResolvedAst type carries:

  • The declaration the macro is attached to (with its resolved metatype, decorators, fields).
  • Read-only access to in-scope declarations (other concepts, traits, fns, metatypes).
  • Read-only access to the metatype calculus (axis values, computed orders, recognized shapes).
  • A diagnostic emitter (input.emit_error(span, code, message)).

TokenStream is the same shape as Rust’s, scoped to Argon’s syntax. The quote! macro inside the Rust crate lets authors write Argon-source-shaped templates with #identifier interpolation.

D5 — Hygiene

Macros are hygienic by default. Identifiers introduced by a macro do not collide with identifiers in the surrounding scope. A macro that expands to:

let x = compute_thing();
return x + outer_var;

binds a fresh x that does not shadow any caller-scope x. outer_var resolves in the scope of the macro-call site (where the macro was invoked), not in the scope of the macro-definition site (where the macro was written).

Authors can opt out of hygiene with unhygienic! blocks when intentional capture is needed (rare; mostly for debugging).
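Rust's macro_rules! gives a partial analog of the fresh-binding rule, sketched below. The analogy is not exact: in Rust a macro body cannot refer to a caller's local variable by name, so the caller's value is passed in as an argument here, whereas D5 lets outer_var resolve at the call site directly. The names plus_ten and hygiene_demo are illustrative.

```rust
// Rust analog of the fresh-binding half of the hygiene rule (illustrative only).
macro_rules! plus_ten {
    ($outer:expr) => {{
        let x = 10; // macro-introduced x: distinct from any x at the call site
        x + $outer  // $outer keeps the meaning it had where the caller wrote it
    }};
}

/// Returns (result of the macro, caller's x after the macro ran).
pub fn hygiene_demo() -> (i32, i32) {
    let x = 1;              // caller-scope x
    let sum = plus_ten!(x); // 10 + caller's x, not 10 + 10
    (sum, x)
}
```

If the expansion were unhygienic, the argument x would be captured by the macro's own binding and the sum would be 20.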

D6 — What macros emit

A macro can emit any of the six substrate atoms (RFD-0040 D1):

  • New metatypes (rare; for vocabulary extensions).
  • New concepts.
  • New rules (fn, derive, query, mutation).
  • New traits.
  • New decorators.
  • New macro invocations (recursive expansion).

Emitted AST is re-elaborated through the normal pipeline. A macro that emits a pub kind declaration produces a concept that the metatype calculus sees, that downstream rules can reference, that other macros can introspect.

D7 — What macros do not do

  • No runtime access. Macros run at compile time; they cannot read the live database or query the deployed kernel.
  • No bypass of the type system. Emitted AST goes through full elaboration. A macro that emits ill-typed code produces compile errors at the emit site.
  • No bypass of the metatype calculus. A macro that emits a concept with bogus metatype assignments fails metatype validation like any hand-written concept.
  • No bypass of the purity ladder. A macro that emits a fn body calling a mutation produces an OE0840 like any other purity violation.
  • No filesystem access by default. Procedural macros run in a sandboxed Rust environment. File reads require an explicit ox.toml capability grant; build reproducibility requires the read paths to be content-addressed inputs.
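The filesystem rule can be illustrated with a small path check. This is a sketch only: ReadGrants, CapabilityDenied, and the idea of expressing a grant as a list of root directories are assumptions made for illustration, and a real sandbox would enforce access at the process or runtime level and canonicalize paths.

```rust
use std::path::{Component, Path, PathBuf};

/// Denial that a real implementation would presumably report as OE0874.
#[derive(Debug, PartialEq)]
pub struct CapabilityDenied(pub PathBuf);

/// Read roots granted to one macro (hypothetical in-memory form of the grant).
pub struct ReadGrants {
    roots: Vec<PathBuf>,
}

impl ReadGrants {
    pub fn new(roots: &[&str]) -> Self {
        ReadGrants { roots: roots.iter().map(|r| PathBuf::from(*r)).collect() }
    }

    /// Allow a read only if the path sits under a granted root and contains
    /// no `..` component that could climb back out of it.
    pub fn check_read(&self, requested: &Path) -> Result<(), CapabilityDenied> {
        let escapes = requested
            .components()
            .any(|c| matches!(c, Component::ParentDir));
        // Path::starts_with compares whole components, so the root `schemas`
        // does not match the sibling directory `schemas_private`.
        let granted = self.roots.iter().any(|root| requested.starts_with(root));
        if granted && !escapes {
            Ok(())
        } else {
            Err(CapabilityDenied(requested.to_path_buf()))
        }
    }
}
```

Rejecting parent-directory components is a conservative stand-in for canonicalization, which would require touching the filesystem.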

D8 — Recursion and termination

Declarative macros are bounded by a recursion depth (default 128). Procedural macros are bounded by wall-clock time per invocation (default 30 seconds). Both limits are configurable per package via ox.toml.

Mutual recursion between macros is admitted within the depth limit. The compiler emits OE0871 MacroRecursionLimit when the limit is exhausted.

D9 — Spans and diagnostics

Every token emitted by a macro carries a span that points back to the macro invocation site. Diagnostics on macro-emitted code report:

  1. The error at the emitted location.
  2. The macro invocation that produced the location.
  3. The macro definition site.

The diagnostic format:

error[OE0123]: type mismatch
  --> src/leases.ar:42:5
   |
42 |     @[derive(Eq)]
   |     ^^^^^^^^^^^^^ macro invocation
   |
   = note: emitted by macro `derive_eq` at lib/std/derive.ar:18:4
   = note: expected `Text`, found `Nat` (in synthesized impl body)
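One way to carry the three-part chain is a small record with a renderer. The sketch below is an assumption about shape, not the RFD-0028 schema: Site and MacroDiagnostic are hypothetical types, and the renderer omits the source-line excerpt shown above.

```rust
use std::fmt;

/// A file:line:col location.
pub struct Site {
    pub file: &'static str,
    pub line: u32,
    pub col: u32,
}

impl fmt::Display for Site {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "{}:{}:{}", self.file, self.line, self.col)
    }
}

/// The three pieces a macro-related diagnostic reports (hypothetical shape).
pub struct MacroDiagnostic {
    pub code: &'static str,
    pub message: &'static str,
    pub detail: &'static str,     // 1. the error at the emitted location
    pub invocation: Site,         // 2. the macro invocation
    pub macro_name: &'static str,
    pub definition: Site,         // 3. the macro definition site
}

impl fmt::Display for MacroDiagnostic {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        writeln!(f, "error[{}]: {}", self.code, self.message)?;
        writeln!(f, "  --> {}", self.invocation)?;
        writeln!(
            f,
            "   = note: emitted by macro `{}` at {}",
            self.macro_name, self.definition
        )?;
        write!(f, "   = note: {}", self.detail)
    }
}
```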

D10 — Raw-token attribute macros (the DSL-in-body case)

Some procedural macros embed a domain-specific language whose syntax is not valid Argon. The canonical case is @[statemachine] (D11), which wants phases { … } and transitions { … } blocks inside the annotated declaration’s body; neither is admitted by RFD-0040 D6’s kind-body grammar.

For these cases, procedural macros declare a raw-token mode in their Rust definition:

#[argon_macro(raw_tokens)]
pub fn expand(item: RawItemTokens) -> Result<TokenStream, MacroError> {
    // item.attributes() — decorators on the declaration (resolved)
    // item.head()       — the kind/category/relation/etc. header (resolved)
    // item.body()       — the body as a TokenStream, NOT parsed as Argon
    // ...
}

The parser’s behavior changes when a raw-token-mode macro is attached to a declaration:

  1. Parse the declaration’s head (pub kind Lease, pub category Foo, etc.) as usual.
  2. Tokenize the body without attempting to parse it as Argon. Deliver as a TokenStream.
  3. Skip name resolution and the metatype calculus for the declaration until the macro emits its expansion.
  4. After the macro emits, the emitted AST goes through the normal pipeline (parse → name resolution → metatype calculus → recognize).

This mirrors Rust’s #[proc_macro_attribute]: the macro receives a raw token stream of the item, parses it itself, and emits valid Rust. Raw-token mode is opt-in per macro; default procedural macros (without raw_tokens) receive the fully resolved AST per D4.

The macro author is responsible for parsing the DSL with their own parser (e.g., using syn and quote adapted for Argon tokens). Diagnostics emitted from the macro’s parsing carry spans pointing at the original source locations.
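As an illustration of what "their own parser" can look like, here is a minimal hand-written parser for transition lines of the form From -> To { on: event }. It is a sketch under simplifying assumptions: it works on a plain string and one transition per line instead of a TokenStream, it tracks only a line number instead of a full span, and Transition, ParseError, and parse_transitions are hypothetical names.

```rust
#[derive(Debug, PartialEq)]
pub struct Transition {
    pub from: String,
    pub to: String,
    pub on: String,
}

#[derive(Debug, PartialEq)]
pub struct ParseError {
    pub line: usize, // 1-based line within the body, for pointing at the source
    pub message: String,
}

/// Parse lines such as `Pending -> Active { on: signed },`.
pub fn parse_transitions(body: &str) -> Result<Vec<Transition>, ParseError> {
    let mut out = Vec::new();
    for (idx, raw) in body.lines().enumerate() {
        let line = raw.trim().trim_end_matches(',');
        if line.is_empty() {
            continue; // blank lines carry no transition
        }
        let err = |message: &str| ParseError { line: idx + 1, message: message.to_string() };
        let (from, rest) = line.split_once("->").ok_or_else(|| err("expected `->`"))?;
        let (to, attrs) = rest.split_once('{').ok_or_else(|| err("expected `{`"))?;
        let attrs = attrs.trim().strip_suffix('}').ok_or_else(|| err("expected `}`"))?;
        let (key, event) = attrs.split_once(':').ok_or_else(|| err("expected `on: <event>`"))?;
        if key.trim() != "on" {
            return Err(err("expected `on: <event>`"));
        }
        out.push(Transition {
            from: from.trim().to_string(),
            to: to.trim().to_string(),
            on: event.trim().to_string(),
        });
    }
    Ok(out)
}
```

Keeping the position in every error is what lets the macro report diagnostics against the original source instead of against its own internals.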

D11 — State machines as a macro (using raw-token mode)

The statemachine macro is the canonical demonstration of D10. It declares raw_tokens mode and reads a declaration like:

@[statemachine]
pub kind Lease {
    phases {
        Pending,
        Active,
        Expired,
        Terminated,
    }

    transitions {
        Pending -> Active   { on: signed },
        Active  -> Expired  { on: term_ended },
        Active  -> Terminated { on: terminated },
    }
}

The kind body contains phases { … } and transitions { … } blocks — not valid Argon kind-body syntax under RFD-0040 D6. Raw-token mode (D10) lets the macro receive the body as a TokenStream and parse it with the macro’s own parser. The macro then emits valid Argon:

  1. A phase concept for each phase (Pending, Active, Expired, Terminated), each with a pub phase metatype assignment.
  2. A derive rule for each transition, firing on the event.
  3. Reachability constraint rules from the transition graph.
  4. A current_phase(l: Lease) -> Phase fn returning the current phase from bitemporal kernel state.

The kernel — which already understands the phase metatype via the metatype calculus — handles bitemporal phase tracking unchanged. The compiler has no special knowledge of state machines. Issue #321 closes as “no language change required; package implementation.”
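The reachability analysis behind item 3 can be sketched as a breadth-first search over the transition graph. This is an illustration of the check a macro author might run before emitting constraint rules, not the macro's actual output; unreachable_phases is a hypothetical helper, and treating one designated phase as the initial phase is an assumption, since this RFD does not say how the initial phase is chosen.

```rust
use std::collections::{BTreeSet, HashMap, VecDeque};

/// Return the phases that cannot be reached from `initial` by following
/// transitions, in declaration order.
pub fn unreachable_phases<'a>(
    phases: &[&'a str],
    transitions: &[(&'a str, &'a str)],
    initial: &'a str,
) -> Vec<&'a str> {
    // Adjacency list: phase -> phases reachable in one transition.
    let mut next: HashMap<&'a str, Vec<&'a str>> = HashMap::new();
    for &(from, to) in transitions {
        next.entry(from).or_default().push(to);
    }

    // Standard breadth-first search from the initial phase.
    let mut seen: BTreeSet<&'a str> = BTreeSet::new();
    seen.insert(initial);
    let mut queue = VecDeque::from([initial]);
    while let Some(phase) = queue.pop_front() {
        if let Some(successors) = next.get(phase) {
            for &succ in successors {
                if seen.insert(succ) {
                    queue.push_back(succ);
                }
            }
        }
    }

    phases.iter().copied().filter(|p| !seen.contains(p)).collect()
}
```

For the Lease example above, starting from Pending, every declared phase is reachable, so the result is empty; a phase with no incoming transition would be reported.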

D12 — Patterns as macros

RFD-0019 patterns become a stdlib macro family. A pattern declaration:

@[pattern]
pub kind Subscription[Subject, Object] {
    holder: Subject,
    target: Object,
    started_at: DateTime,
    canceled_at: DateTime?,
}

expands to a generic concept declaration parameterized by Subject and Object. Use sites:

type GymMembership = Subscription[Person, Gym]

are macro invocations that produce concrete concept declarations.

The pattern surface that RFD-0019 specified continues to read the same; the implementation is library code in std::patterns.

D13 — Macros and visibility

Macros declared pub in a package are usable by dependents. Non-pub macros are package-local.

Declarative macros across packages are imported alongside other items:

use std::collection::vec

Procedural macros are imported via decorator path:

@[std::derive::Eq]
pub kind Person { … }

@[my_lib::statemachine]
pub kind Workflow { … }

Common stdlib macros (derive, statemachine, pattern) ship under unqualified names by convention.

D14 — Diagnostic codes

  • OE0870 MacroNotFound — invocation references a macro that doesn’t exist or isn’t in scope.
  • OE0871 MacroRecursionLimit — depth limit exhausted.
  • OE0872 MacroTimeout — procedural macro exceeded its wall-clock limit.
  • OE0873 MacroEmitInvalid — emitted AST fails parse or elaboration; nested diagnostic chain points to the failure.
  • OE0874 MacroCapabilityDenied — procedural macro attempted filesystem/network access without the ox.toml capability grant.
  • OE0875 HygieneViolation — unhygienic! block leaks an identifier in a way that produces ambiguous resolution.
  • OE0876 MacroArgumentMismatch — declarative macro invocation does not match any case.

Rationale

Two flavors are necessary. Declarative macros cover the cases where syntactic pattern-matching is sufficient (vec!, println!); procedural macros cover everything else (derive_eq needs to inspect fields, statemachine needs to walk a transition graph). Either alone is insufficient — declarative-only can’t see types; procedural-only is overkill for trivia.

Procedural macros need resolved-AST access. The Rust analog stops at token streams: its tooling crates (syn, quote) parse tokens into a syntax tree and generate tokens back, with no type information. That is enough for #[derive(Eq)], which only needs the field list as written, but not for macros that must consult resolved types or metatype axes. Argon’s compile-time model already runs name resolution and metatype-calculus convergence before macro expansion in the proposed pipeline, so passing the resolved AST directly is cheaper and more correct. The macro author doesn’t re-implement the elaborator.

Hermetic per-package expansion. The composition pipeline (RFD-0034) compiles per-package then composes; macros run within the per-package step. Cross-package macro invocations work because the consuming package’s oxc step runs each macro the dependency declares. No global macro registry; no compose-time-only macros.

The decorator-shape invocation for procedural macros (vs !-suffix for declarative). A procedural macro is a declaration-level transformation: it takes a declaration as input, emits declarations. A declarative macro is an expression-level transformation: it takes tokens, emits tokens. The two invocation surfaces match the two scopes. @[derive(Eq)] reads as “this declaration gets the Eq impl appended”; vec!(1, 2, 3) reads as “this expression is a vec literal.”

The @[…]! form for procedural macros at expression position handles the edge case where a procedural macro should produce an expression (e.g., @[generate_audit_summary]! returning a structured value). The ! suffix flags expression-position; the @[…] flags the macro identity.

No runtime dispatch, no bytecode interpretation. Procedural macros are compiled Rust. They execute deterministically given identical inputs (subject to wall-clock limits and the sandbox). This matches the rest of Argon’s static-only-at-compose discipline.

State machines as the proving example. If the macro system can express state machines with the same guarantees the compiler-built-in lifecycle provides today, it can express anything the team currently asks for. The metatype-calculus visibility is the load-bearing capability; without it, macros could emit phase concepts but couldn’t validate disjointness or reachability without compiler help.

Filesystem capability gating. Procedural macros run untrusted code at compile time. The capability grant model (ox.toml declares “this package’s macros need read access to X”) is the standard solution; the alternative (full filesystem access) makes build reproducibility nearly impossible.

Consequences

Compiler. New iterative macro-expansion phase that runs after the metatype calculus converges on user-written declarations and re-enters the pipeline on emitted AST (per D3). New oxc-macro-api crate published alongside oxc-protocol. Procedural macros build as Rust dynamic libraries; the elaborator loads them in a sandboxed process.

Sandbox. The procedural-macro execution sandbox restricts: no network access, filesystem reads gated by ox.toml, no environment variable access, wall-clock time limit, memory limit. The implementation reuses existing sandboxing primitives (e.g., a WebAssembly runtime such as wasmtime, or Linux seccomp + namespaces); selection deferred to implementation.

Stdlib macros. Initial stdlib macro set:

  • derive(Eq, Hashable, Json, Default, Display, Ord) for trait-impl generation.
  • statemachine for the lifecycle pattern.
  • pattern for RFD-0019 patterns.
  • vec!, println!, format! for declarative ergonomics.

Patterns RFD reframed. RFD-0019 stays committed; its first-class status is preserved but the implementation is a macro. The book chapter on patterns documents the macro and the recommended use cases.

State machine issue closes. Issue #321 (lifecycle → statemachine rename) closes when the statemachine macro ships. The compiler removes the lifecycle { … } keyword in argon 1.1.

ox.toml extensions. New [macros] section listing declarative and procedural macro entries. Capability grants under [macros.<name>.capabilities].

Wire types. oxbin artifacts include a macro_expansion_provenance section recording which macros expanded which declarations. RFD-0028 diagnostics extend to carry macro-expansion chains.

Book chapter. ch04-04-macros.md covers the two flavors, invocation syntax, hygiene rules, and authoring procedural macros. ch03-05-state-machines-via-macro.md replaces the existing state-machine chapter.

Migration. No source migration required for the macro system itself — it’s additive. The lifecycle removal in argon 1.1 is mechanical (rewrite to @[statemachine]).

Historical lineage

  • RFD-0019 (patterns are first-class) — committed. Patterns become a macro library; the declaration surface continues to read the same.
  • RFD-0026 (living diagrams) — committed. Diagrams can be authored either as compiler-built-in or as a macro. Reassessed in a follow-up RFD.
  • RFD-0028 (diagnostics schema 1.0) — committed. Extended for macro-expansion provenance.
  • RFD-0034 (composition pipeline) — committed. Macros expand within its per-package boundary.
  • RFD-0040 (substrate atoms and the explicit-writes principle) — discussion. Names macros as atom #6; this RFD specifies the grammar and semantics.
  • Issue #321 (lifecycle → statemachine) — closes when the statemachine macro ships.

This RFD does not supersede any committed RFD.