RFD-0040 — Substrate atoms and the explicit-writes principle
Question
What are Argon’s irreducible atoms, and what rules govern their composition? This RFD names the substrate, commits to a categorical purity ladder for the four operator surfaces, and commits to the no-implicit-writes rule. Three companion RFDs ship grammar against this substrate: RFD-0041 (traits), RFD-0042 (macros), RFD-0043 (query/mutation surface), RFD-0044 (MLT cross-level instantiation).
Context
Argon’s surface has accreted distinctions through incremental implementation that the substrate does not support:
- Clause-structured mutation bodies (
require / do / retract / emit / return) that a team contributor compared to COBOL. - A
let x: T = exprform insidedo { }that silently persists when the type annotation is present and binds-locally when it isn’t — the syntax is otherwise identical. - An
emit Event { … }form that routes invisibly to one of three sinks (audit log, HITL queue, notification dispatcher); the sink is not visible at the call site. - An
Instantiatescore-IR atom locked to order-0 → order-1, with no variant supporting concept-to-concept instantiation (the MLT case). - A
::operator that lowers to meta-property assignment, not instantiation, despite reading as instance-of. - The
:token carrying five distinct meanings disambiguated by lookahead (RFD-0015). - No trait or interface surface for declaring behavioral contracts the team has been asking for.
- No macro surface, forcing higher-order patterns (state machines, derived boilerplate) into either compiler-built-in keywords or hand-written rule scaffolding.
The predicate language has always been one grammar with graduated admission (RFD-0005: rule-body, constraint, diagram contexts admit different tiers of the same predicate surface). The accretion is at the operator and writes layer, not the predicate layer.
Three inputs ground this RFD beyond the codebase itself:
- The Lean mechanization (
argon-formal/). It shows that beneath the four operator surfaces sits one stratified-fixpoint construct with three rule categories (monotone / NAF / constraint-check). The four operators are projections, not separate constructs at the formal level. The Lean is one input among several — the surface design is not obligated to match it line for line — but the collapse to one underlying mechanism is real and worth surfacing. - The Epilog technical reference. Three lexically-distinct rule arrows (
:-derivation,:=function,==>transition). Reads vs writes are categorical at the syntax level, not contextual. The discipline is proven. - The
sharpe-ontology/story branches. Team members hit the current limits and wrote in imaginary syntax. Their wish list — relation axioms, cardinality constraints, higher-order type fields, behavioral contracts on type fields — resolves to atoms-becoming-more-expressive, not to new atoms beyond the ones this RFD names.
Decision
D1 — Six atoms
Argon’s irreducible primitives are exactly six.
1. Metatype. A sealed-at-top declaration of a class of entities. pub metatype kind = { rigidity: rigid, sortality: sortal, order: 1 }. The primitive Metatype self-instantiates; every user-declared metatype is an instance of Metatype. Metatypes carry typed axes that constrain the entities they classify. Vocabularies (UFO, BFO, GFO, DOLCE) declare metatypes; the language ships no built-in vocabulary content (RFD-0002).
2. Concept. An instance of a metatype. The “type” in the ordinary type-system sense. Concepts can specialize other concepts (<:) and instantiate other concepts (:). A relation is a concept whose metatype declares arity ≥ 2; there is no separate relation atom.
3. Rule. The unified computational primitive. Has a head, a body, and a mode. The mode picks one of four surface keywords (fn, derive, query, mutation) which control purity, persistence, and call discipline. Each rule projects to one of three formal categories (monotone / NAF / constraint-check) at lowering. The Lean mechanization treats these as one construct; the surface keeps four projections for ergonomic clarity. See D2.
4. Trait. A named behavioral contract. A trait declares required fn / derive / query / mutation items (zero or more of each), optional associated types, optional associated constants, and optional default implementations for any required item. Traits can extend other traits.
pub trait Identifiable {
fn id(self) -> Text
}
pub trait Comparable <: Identifiable {
fn compare(a: Self, b: Self) -> Ordering
fn equal(a: Self, b: Self) -> Bool = compare(a, b) == Ordering::Equal // default impl
}
Implementation is a separate declaration:
impl Identifiable for Person {
fn id(self) -> Text = self.email
}
Concepts and traits are orthogonal axes. Concepts answer “what is this entity?” (ontological commitment); traits answer “what can this entity do?” (behavioral commitment). A concept implements zero or more traits; a trait is implemented by zero or more concepts.
Dispatch is static. Generic parameters bound by traits — pub fn sort<T: Comparable>(xs: List[T]) -> List[T] — monomorphize at ox compose time using RFD-0034’s pipeline and RFD-0036’s bounded-generic machinery. The kernel sees fully-specialized rule sets. No runtime trait objects in v1; no dynamic dispatch.
Traits compose with the purity ladder transparently: a trait method declared as fn is fn-pure for any caller bounded by the trait; a trait method declared as mutation makes the bounded site impure. The purity rules in D3 apply unchanged.
RFD-0041 specifies grammar, trait inheritance semantics, conflict resolution for diamond impls, coherence rules, and the interaction with RFD-0036 generic bounds.
5. Decorator. A named, parameterized shape annotation attachable to metatypes, concepts, relations, rules, or traits. Decorators carry semantics (FOL formulas or constraint rules in their bodies); the compiler’s shape recognizer fast-paths declared shapes when it can match them.
@[transitive, asymmetric, irreflexive]
pub rel parent_of: Animal -> Animal
Decorators are user-declarable in packages via pub decorator. They do not extend the type system or the rule system; they annotate. The recognizer’s job is to take FOL bodies and identify when they match a known shape (transitivity, qualified cardinality, etc.) so the reasoner can fast-path. Decorators name those shapes.
6. Macro. A compile-time AST → AST transformation. Two flavors:
- Declarative macros —
macro_rules-style pattern matching on Argon syntax. Pure-syntactic; no type access. For ergonomic sugar. - Procedural macros — compiled Rust code invoked at
oxcelaboration time. Procedural macros receive a resolved AST: type assignments, metatype assignments, decorator annotations, tier classifications. They emit any of the six atoms.
Macros expand within the per-package composition boundary established by RFD-0034. Cross-package macro use rides the existing composition wiring.
Two consequences for the surface language:
-
State machines move out of the language. The
lifecycle { … }/statemachine { … }construct (issue #321) becomes a stdlib procedural macro. The macro reads a state-machine declaration, walks phases and transitions, and emits: phase concepts with the right metatype-calculus axes, transition derives, reachability constraint rules. The kernel handles bitemporal phase tracking unchanged — it already understands phase concepts via the metatype calculus. -
Derive macros for boilerplate.
@[derive(Eq, Hash, Json)]lets concepts get standard trait impls for free. The macro reads the concept’s fields through the resolved AST and emits the appropriateimplblocks.
Macros do not have runtime access, do not see database state, and do not bypass the type system. Emitted AST re-elaborates through the normal pipeline.
RFD-0042 specifies the macro grammar, procedural-macro Rust API surface, resolved-AST format, expansion phase placement in the elaboration pipeline, and hygiene rules.
Everything else in the surface — patterns (RFD-0019), living diagrams (RFD-0026), state machines, refinement types, collection operations (RFD-0039) — is a composition of these six. Patterns are macros that emit parameterized rule scaffolds. Living diagrams are macros that emit visual-tier annotations. State machines, as noted, are macros that emit phase concepts and transition derives. Refinement types are decorators on existing concept declarations. Collections (RFD-0039) are substrate primitives at the type-system level; their methods are fn declarations like any other.
D2 — Four surface keywords, one rule primitive
The four keywords surface different modes of the same underlying rule construct. They share grammar, body shape (modulo the :- marker on derive), and elaboration path. Mode determines: purity, write-scope, which tier of the decidability ladder the body admits, and how callers invoke the rule.
| Keyword | Purity | Write scope | Body shape | Call style |
|---|---|---|---|---|
fn | pure value, deterministic given (args, state-version) | none | expression-shaped | invoked by name; returns a value |
derive | pure entailment | derived facts in the saturation closure (transient) | rule-atom-shaped (:- body) | fires automatically during saturation; not invoked by name |
query | observation of saturated state | none | shape-projection over rule atoms | invoked by name; returns a set or value |
mutation | impure | transactional + persisted to the event log | shape-projection + write verbs | invoked by name; returns a value with side effects |
derive and query walk the same rule machinery; the difference is direction. derive writes into the saturation closure; query reads from it.
The compute keyword is renamed to fn. Migration is mechanical.
D3 — The call-purity ladder
A caller can only call callees at equal-or-lower impurity.
fn — pure value (depends on args + state-version)
query — reads saturated state
derive — modifies saturation closure (deterministic given inputs)
mutation — modifies persisted state (transaction-scoped)
| Caller \ Callee | fn | query | derive | mutation |
|---|---|---|---|---|
fn | ✓ | ✓ | ✗ | ✗ |
query | ✓ | ✓ | ✗ | ✗ |
derive | ✓ | ✓ | ✓ | ✗ |
mutation | ✓ | ✓ | ✓ | ✓ (composes into one atomic transaction) |
Three semantic clarifications:
fncallingqueryis allowed. Afnis pure with respect to its arguments and the world-state version at call time. Cacheability is keyed by(args, state-version). The alternative —fnmust take all state via arguments — is unworkable for a knowledge-base language.derivecannot callmutation. Every ✗ entry in the matrix is forbidden, butderive → mutationis the load-bearing case: it’s what makes saturation termination and determinism possible. Other forbidden directions (fn → derive,fn → mutation,query → derive,query → mutation) follow from the same purity invariants but are usually caught earlier — afnbody that calls aderiveis more obviously a category error than aderivebody that calls amutation.mutation → mutationcomposes into a single atomic transaction. Commit or rollback is at the outermost call site.
Trait methods inherit the purity of their declared keyword. A <T: Trait> bound carries the most-impure method through; calling code is restricted accordingly.
D4 — The explicit-writes principle
Persistent state changes happen only through four explicit verbs: insert, update, delete, emit. emit must declare its sink.
| Today | Replacement |
|---|---|
let x: T = expr (persists) | insert x: T { … } |
let x = expr (local) | let x = expr (unchanged; never persists) |
path = value in do { } | update p set { field = value } |
retract { path: value where … } | delete p where … |
emit Event { … } | emit (audit: Event { … }) / emit (hitl: Event { … }) / emit (notify: Event { … }) |
derive head :- body | unchanged; book makes “derived facts are transient and saturation-regenerated” explicit |
A reader grepping for state changes finds them by searching for four keywords. emit carries its sink as a syntactically required prefix; there is no destination-by-runtime-routing.
D5 — Syntactic separation of instantiation and specialization
: and <: take exactly one meaning each, in all positions:
:means instantiation.a : Areads as “a is an instance of A”. Applies to individuals-of-concepts, concepts-of-concepts (D12), and statements-of-statements.<:means specialization.A <: Breads as “A is a subtype of B”. Also used for trait inheritance:trait Ord <: Eq.
The five overloaded meanings RFD-0015 allowed for : are reassigned:
| Today | Tomorrow |
|---|---|
pub kind Dog : Animal (specialization sugar) | pub kind Dog <: Animal |
field: T (field typing) | field: T (kept; instantiation at the term level) |
arg: T (function parameter typing) | arg: T (kept; instantiation at the term level) |
pub rel parent_of: A -> B (relation signature) | pub rel parent_of: A -> B (kept; the LHS is what’s being typed) |
pub kind Dog : AnimalSpecies (genuinely instantiation) | pub kind Dog : AnimalSpecies (now means what it always read as) |
RFD-0015’s disambiguation rule disappears. The auto-migration tool rewrites kind A : B to kind A <: B where B is a sibling concept (not a higher-order concept).
D6 — Block syntax: all bodies use { }
Multi-statement and multi-clause bodies use braces:
- Concept, relation, metatype, and trait bodies — braces.
fnbodies with multi-statement or non-trivial expression content — braces.derivebodies —:-followed by braces. The:-carries the semantic claim (“this is an entailment rule”); the braces carry the visual structure.query,mutation, andimplbodies — braces.- Decorator bodies (when they carry semantics) — braces.
One exception: trivial single-expression fn bodies admit the = expr shorthand.
pub fn rent_per_day(l: Lease) -> Money = l.monthly_rent / 30
Single-expression derive / query / mutation bodies do not get the shorthand.
D7 — Decorator conventions
One @[ … ] block immediately preceding the declaration, with multiple shape claims comma-separated inside.
@[transitive, asymmetric, irreflexive]
pub rel parent_of: Animal -> Animal
Not after the declaration. Not split into multiple @[…] blocks. Decorators with arguments use parentheses: @[axiom("a3"), bounded_order(3)].
D8 — Statement separators
Inside { } bodies:
;separates statements infn,query,mutation,impl, and trait-method bodies.,separates atoms inside:- { … }derive bodies. The comma is logical conjunction.
Mixing separators is a syntax error.
D9 — Multi-supertype and multi-instantiation: line-broken
Single specialization, single instantiation, or one each — inline:
pub kind Husky <: Dog
pub kind Dog : AnimalSpecies
pub kind Husky <: Dog : DogBreed
Multiple supertypes or multiple instantiations — line-broken, one clause per line:
pub kind Husky
<: Dog, WorkingAnimal
: DogBreed, RegistrationCategory
Inline comma-lists with both <: and : clauses on one line are a syntax error.
D10 — Query syntax: select without from
The target concept is the first non-keyword token after select. There is no from keyword.
pub query active_leases(t: Person) -> [Lease] {
select Lease { id, monthly_rent, end_date }
where .tenant = t and .status = Active
order by .end_date
}
Aggregates over derived sets use a with-binding:
pub fn active_monthly_total() -> Money {
with active := select Lease where .status = Active;
select sum(active.monthly_rent)
}
D11 — Scenario blocks for hypothetical writes
query bodies admit no top-level write verbs. The escape hatch is the scenario { } block:
pub query rent_under_proposed_increase(percent: Real) -> Money {
scenario {
update Lease set { monthly_rent = .monthly_rent * (1 + percent / 100) }
where .status = Active;
};
with active := select Lease where .status = Active;
select sum(active.monthly_rent)
}
scenario { … } creates a kernel fork (RFD-0023) at the moment of entry, with the query’s current state as the fork’s base. Writes inside the block apply to the fork. When the block closes (};), the fork’s state replaces the query’s “current state” for the remainder of the query body — subsequent statements (including any final select) read against the forked state. The fork is discarded when the query body completes (the query returns or implicit-returns); writes never propagate to persistent storage.
Multiple scenario blocks in a single query body are chained: each scenario forks from the state including all previous scenarios’ writes. To branch from a known earlier point, factor the shared setup into a with-binding before any scenario begins. Nested scenarios are a syntax error (OE0836). scenario does not appear in mutation bodies.
D12 — Cross-level instantiation (MLT) via a single IR atom
The Core IR atom Instantiates { individual: String, concept_id: u64 } is replaced by:
#![allow(unused)]
fn main() {
InstanceOf { instance: EntityRef, type_concept: u64 }
}
where EntityRef = Individual(u64) | Concept(u64) | Statement(u64). One atom for all instantiation, parameterized by what kind of entity is being instantiated. Tonto’s kind Dog (instanceOf AnimalSpecies) and Argon’s pub kind Dog : AnimalSpecies both lower to one InstanceOf atom with instance: Concept(dog_id).
Order is derived from the instantiation closure, not stamped at declaration: order(c) = 1 + max(order(x) for x in instances_of(c)), where instances_of(c) = { x | x : c } (the direct instances of c), starting from individuals at order 0. Types are strictly higher-order than their instances (Carvalho & Almeida 2017); see RFD-0044 D3 for the full derivation. Vocabularies can decorate metatypes with @[bounded_order(n)] or @[order(=2)] to assert a level; the elaborator validates against the derived value.
RFD-0044 covers grammar, the new elaboration phase, and the Lean extension.
Rationale
Six atoms, not five. The previous draft folded traits into “future work.” Traits are not a composition of the other five: not metatypes (a trait is not a category of entities), not concepts (a trait carries no ontological commitment), not rules (a trait declares signatures, not bodies), not decorators (decorators annotate; traits contract), not macros (macros generate code; traits constrain it). They are a distinct atom. The team’s wish list — behavioral contracts on type fields, polymorphic generic bounds, derive-style boilerplate — requires the atom.
Concepts and traits are orthogonal axes. This is the load-bearing claim that keeps Argon distinct from OO inheritance. Concepts answer “what is this” — UFO/BFO-classified ontological entities. Traits answer “what can this do” — Rust-style behavioral contracts. A Person concept can implement Identifiable, Comparable, Json, without those traits leaking into Person’s ontological commitments. The metatype calculus governs concepts; trait bounds govern generic-parameter resolution.
Macros need metatype-calculus visibility. State machines today require compiler-owned guarantees (phase disjointness, reachability checking, bitemporal tracking) because the metatype calculus is only available to the compiler. The right fix is to expose the metatype calculus to procedural macros so user-defined patterns can require the same guarantees. State machines then move to a stdlib macro — the compiler no longer has privileged knowledge of state machines, but the kernel still recognizes phase concepts because the metatype calculus does.
The four keywords stay despite collapsing to one rule. The Lean shows one underlying construct; the surface keeps four projections because the keyword IS the load-bearing reader signal. query foo() declares no side effects to a reader before they read the body; mutation bar() declares them. Collapsing the keywords erases this signal. The four-keyword surface preserves it while honoring the single-construct substrate.
compute → fn. RFD-0015 chose PL idiom; fn is more PL-idiomatic than compute. Every PL written in the last twenty years uses fn or a near-analog. Semantic content is unchanged; only spelling.
Explicit writes via four verbs. The implicit-writes audit found four mechanisms by which today’s surface persists state without visible signal at the call site. Categorical fix: every persistent write uses one of four keywords; every external effect uses emit with explicit sink. The Epilog discipline (lexical mark distinguishes pure from effectful at every site) is the proven model.
: / <: separation. The five overloads of : produced five places where readers had to slow down. Math notation is unambiguous. The cleanup unblocks MLT (D12), which uses : for cross-level instantiation in the same syntactic role.
:- and braces both. :- is semantic (this is an entailment rule); { } is visual structure. Keeping both costs one extra character per derive vs the bare-body alternative and gives readers two signals where one might be ambiguous.
Consequences
This RFD is the philosophical commitment. Grammar is in the companion RFDs:
- RFD-0041 — Traits and behavioral polymorphism. Grammar for
pub traitdeclarations,impl Trait for Conceptblocks, trait inheritance (<:), associated types, generic bounds (<T: Trait>), multiple bounds, default-method overriding, coherence rules. Closes the team’s “implement trait feature” TODOs in the workflow ontology. - RFD-0042 — Macros with metatype-calculus visibility. Grammar for declarative macros (
macro_rules-style), procedural macros (compiled Rust API surface), macro invocation syntax, hygiene rules, expansion-phase placement, the resolved-AST format procedural macros receive. State machines move to a stdlib macro; pattern declarations (RFD-0019) become macros. - RFD-0043 — Query and mutation surface.
select / insert / update / delete / with / emit / scenariogrammar replacingrequire / do / retract / emit / return. Purity-violation diagnostic codes. The full syntax ofscenarioblocks. - RFD-0044 — Cross-level instantiation (MLT). Generalizes
InstantiatestoInstanceOf, adds the:declaration form for concept-of-concept, derives order, extendsTypeGraphMLT.lean.
Breaking changes. Adoption is a major version bump — argon 1.0. Migration includes:
compute→fn.pub kind A : B→pub kind A <: B.let x: T = exprinsidedo { }→insert x: T { … }.path = valueinsidedo { }→update p set { field = value }.retract { … }→delete p where ….emit Event { … }→emit (audit|hitl|notify: Event { … }).require / do / retract / emit / returnclause structure →{ … }body with statements (RFD-0043).- Decorator-after-declaration → decorator-before-declaration single block.
lifecycle { … }/statemachine { … }→ stdlib macro (no language change required after RFD-0042 lands).
The compiler ships ox migrate for mechanical rewrites (items 1–6, 8). Item 7 requires manual review; the tool flags every site. Item 9 lands when the stdlib macro ships. Old syntax remains parseable under --edition=0.4 for one minor version; argon 1.1 removes it.
Documentation. Book chapters covering operator distinctions, clause structure, and the : / <: disambiguation are restructured around the six atoms. The “For Agents” reference is rewritten against the new surface.
Predicate grammar consolidation. RFD-0005’s “one grammar, multiple contexts” stays the design. RFD-0043 confirms the rule-body, constraint, and diagram contexts admit the same predicate grammar with graduated tier ceilings. Any surface-level fragmentation that has crept in gets cleaned up there.
Wire types. oxc-protocol bumps in two places: Instantiates → InstanceOf (additive enum variant), and mode tags added to rule declarations. SDK regeneration via oxc-codegen is mechanical.
The Lean is extended with the instantiates relation and order theorem (per D12 and RFD-0044). The rule machinery, fixpoint, and decidability theorems are reused unchanged.
Historical lineage
- RFD-0002 (foundational-ontology neutrality) — committed. Prerequisite for the six-atom model.
- RFD-0005 (one grammar, multiple contexts) — committed. The predicate-graduation design this RFD respects.
- RFD-0006 (three compute forms) — committed. Survives as
fnbody shapes after the rename. - RFD-0007 (first-class queries, mutations, provenance) — committed. Queries and mutations remain first-class items; provenance preserved.
- RFD-0015 (PL idiom over proof-assistant idiom) — committed. Extended:
:overload replaced by:/<:separation. PL-idiom principle reinforced bycompute → fn. - RFD-0019 (patterns are first-class) — committed. Reframed: patterns become a macro library.
- RFD-0023 (kernel API v2) — committed. Fork machinery used by
scenarioblocks is already shipped. - RFD-0033 (test mutate statements) — discussion. Compatible;
mutatein test blocks is amutationcall. - RFD-0034 (composition pipeline) — committed. Macros expand within its boundary.
- RFD-0036 (generic declarations) — discussion. Extended by RFD-0041 (trait bounds).
- RFD-0037 (AFT truth-value semantics) — committed. Unchanged.
- RFD-0039 (collections and collection operators) — committed. Unchanged.
This RFD extends, reframes, or refines each cited RFD. It does not supersede any committed RFD wholesale.