RFD-0035 — Binary artifact .oxbin and the execute layer
Question
What is the wire shape of an Argon workspace’s composed output? How does a runtime backend consume it? What invariants does the artifact carry across schema evolution, content-addressing, hot replacement, and tier-bound execution?
Context
RFD-0034 ratifies the two-phase composition pipeline: ox computes wiring + composition signature; oxc instantiates per-package; ox compose merges into a workspace artifact. This RFD specifies that workspace artifact: its file format, its sections, its encoding, its versioning, its content-addressing, and the contract a runtime must honour to consume it.
The artifact is structurally without precedent. It carries simultaneously a typed knowledge container, a program (rules + queries + mutations + computes), a composed link target, a derived-state cache, a provenance archive, a tier-typed safety container, and a standpoint-lattice carrier. No surveyed system carries even four of these in one form. But the structural skeleton — sectioned binary with self-describing preamble, per-section sub-headers, reservation-based extensibility, three-axis versioning, Merkle content-addressing — has been settled by thirty years of compiled-artifact engineering across ELF, Erlang BEAM, WebAssembly, HDT, and Lean 4’s multi-file .olean.* split. RFD-0035 inherits that skeleton and specifies the Argon-shaped sections inside.
The genuine invention surface is six items, each forced by an existing settled property of Argon’s storage and reasoning model: tier-typed sectioning, standpoint-lattice serialisation, the saturation-cache + per-fact provenance combination shipped together, a Kleene–Belnap four-valued bridge handled in query interpretation, cross-package standpoint lattice composition (specified here, executed in RFD-0034), and a bitemporal-basis-T-preserving hot-replacement contract. Each is sketched in this RFD with grounded precedent.
The runtime concerns must factor cleanly so that the kernel becomes one backend among several. An in-process oxc-runtime library, an external embedded engine (e.g., oxigraph-backed), and a future sandboxed bytecode runtime are all anticipated consumers. The runtime contract is specified as a trait surface that any backend implements; the artifact is the contract’s data shape.
A research campaign preceding this RFD ([vault: research/argon-execute-layer/]) produced primary-source-grounded recommendations for every load-bearing decision. The campaign’s findings are summarised inline where relevant; this RFD ratifies twelve specific recommendations and documents six invention zones with sketched approaches.
Decision
1. File format
The artifact’s file extension is .oxbin. A workspace .oxbin is the output of ox compose; a per-package .oxc cache (RFD-0034) shares the same skeleton with a smaller scope.
Magic header: 0x00 0x6F 0x78 0x62 0x69 0x6E 0x00 0x01 — eight bytes — \0oxbin\0\1. Followed by three independent uint32 version numbers (§4) and a reserved uint32. Total file-magic + preamble: 24 bytes.
2. Section model
[file-magic + preamble: 24 bytes]
[global-control ] CBOR-encoded; section directory; composition_signature; wiring_diagram_hash; max_tier_claimed; precomputed standpoint ancestor sets
[symbol-table ] HDT-PFC front-coded string pool of qualified paths + stable_ids
[events ] CBOR streaming; sorted by (tenant_id, fork_id, standpoint_id, predicate, valid_time, tx_time)
[standpoint-lattice ] CBOR DAG; precomputed ancestor sets in global-control
[fork-lineage ] CBOR chain; copy-on-write fork structure
[tier-table ] CBOR per-rule; max_tier_claimed in global-control
[projection-cache ] array of Cap'n Proto segments, content-keyed; one per maintained projection
[arrangement-section ] optional Cap'n Proto Z-set arrangements; future-compat for DBSP
[pathology-flags ] CBOR list; compile-time-detected reasoning pathologies
[doc-blocks ] CBOR per-item; per RFD-0029 typed DocBlock
Ten standard section types, listed in canonical order in the diagram above. Five are mandatory; five are optional. Optional ≠ LAZY: optionality controls presence (the section may be omitted from the artifact); the LAZY sub-header flag controls load discipline (the runtime mmaps it on first access rather than reading it during the synchronous prefix). The two are independent. The ten sections classify as follows:
| Section | Presence | Load | Why |
|---|---|---|---|
global-control | Mandatory | Synchronous | Section directory; without it, no other section can be located. |
symbol-table | Mandatory | Synchronous | Every reference in events / rules / queries / mutations resolves through it; Layer-2 symbol resolution requires it. |
events | Mandatory | Synchronous (mmap’d) | Source of truth for axioms + retractions + derived events; cannot be reconstructed. Mapped at startup; the body is read incrementally on query. |
standpoint-lattice | Mandatory | Synchronous | Every event row carries a standpoint_id; the lattice is needed to interpret ancestor sets. An artifact with a single default standpoint still carries a one-node lattice. |
tier-table | Mandatory | Synchronous | Carries per-rule tier annotations consumed by Layer-2’s tier-consistency sub-pass (§6). Layer 1’s check reads only the max_tier_claimed uint8 from global-control; the tier-table itself is a Layer-2 input. The section’s MANDATORY sub-header flag enforces presence at load time so Layer 2 can run without optional-section degradation (§8). |
fork-lineage | Optional | Synchronous when present | Artifacts without forks (most published packages) omit it. Small enough that lazy-loading is not worthwhile when present. Empty present-form is also valid. |
projection-cache | Optional | LAZY (mmap on first access) | A maintained derivation of the canonical events; the runtime saturates from rules + ABox if the section is absent. Re-add at any time without semantic change. |
arrangement-section | Optional | LAZY (mmap on first access) | Forwards-compatibility for DBSP-shaped Z-set arrangements; absent in DRedc-only artifacts. |
pathology-flags | Optional | Synchronous when present | Small list of compile-time-detected reasoning pathologies. Read at startup so downstream behaviour can adapt; absence is equivalent to “no detected pathologies.” |
doc-blocks | Optional | LAZY (per-item on first deserialise) | A .oxbin produced by ox build --strip-docs omits doc strings. The runtime contract works without them; tools like ox doc require them and refuse such artifacts. |
Section types are numbered as a single uint8 namespace:
0..99— standard sections (the ten listed above; specific type-id assignments live inoxc-protocol::core::oxbin).100..199— Argon-future sections (debug-info, traces, semver-check metadata).200..255— user-custom (third-party tooling). Unknown sections in this range are ignored on load (WASM custom-section discipline) unlessMANDATORY-flagged. This range is the extension hook for tools that want to attach metadata to a.oxbinwithout coordinating with the Argon project.
Each section has a sub-header carrying:
section_type: uint8— number from the registry above.flags: uint8— bit 0 =MANDATORY(load fails if section type is unknown; mirrors Mach-OLC_REQ_DYLD); bit 1 =LAZY(mmap on first access); bit 2 =CONTENT_HASHED(content_hash field is populated); bits 3-7 reserved.content_hash: [u8; 32]— SHA-256 of the section’s encoded body whenCONTENT_HASHED.size: uint64— encoded body size.offset: uint64— byte offset from start of file.
The section directory in global-control indexes every section. The order in the directory is the canonical load order.
3. Encoding — layered
- Outer container + structural sections (
global-control,events,standpoint-lattice,fork-lineage,tier-table,pathology-flags,doc-blocks): CBOR per RFC 8949, with the §4.2.1 Deterministic Encoding discipline (shortest argument forms, definite-length, lex-sorted map keys, preferred float representations). Deterministic CBOR is what makes content-addressing well-defined. - Hot mmap’d cache + Z-set segments (
projection-cache,arrangement-section): Cap’n Proto. Zero-copy load; mmap-friendly; schema-evolution by additive numeric field IDs. - Symbol table: bespoke HDT-PFC (Header–Dictionary–Triples Plain Front Coding). Optimised for the high-prefix-redundancy of qualified Argon paths (
pkg::module::ItemA,pkg::module::ItemB, …).
CBOR tags 65536, 65537, 65538 are claimed for Argon’s PosBool(M) provenance witnesses (axiom_id, rule_id, module_id respectively). These tags fall in the First Come First Served range per RFC 8949 §9.2 — the FCFS range is contiguous from 32768 upward, with the 65536 boundary marking only an encoding-width change (4-byte argument vs 2-byte), not a registration-policy change. There is no “Private Use” range for CBOR tags. Argon’s use of these numbers is therefore a deliberately deferred FCFS application — a lightweight registration that Argon will submit to IANA before the .oxbin format graduates to a public-stable surface. Until the format is public, the small risk that another party registers these specific numbers for incompatible semantics is mitigated by the format’s internal-only deployment plus the major-version-bump migration path. Tag numbers may be reassigned only via a major-version bump of oxbin_format_version.
4. Versioning — three axes + multi-release overlay
The preamble carries three independent uint32 version axes:
oxbin_format_version— section model + encoding evolution. Bumped on additive section types (minor) or breaking section semantics (major).tier_ladder_version— the decidability tier vocabulary itself. Bumped when the tier ladder changes (a tier renamed, added, removed). Today: the ladder defined in RFD-0004. Major bump only.kernel_api_version— the kernel’s HTTP API surface (the resource-oriented v2 model the artifact is composed against). Bumped when the API contract changes. Minor on additive surface; major on breaking.
The robustness principle: producers strict, consumers liberal. A consumer accepts future minor bumps on any axis unless the new artifact carries a MANDATORY-flagged section the consumer doesn’t recognise; in that case load fails with OE1003 UnknownMandatorySection.
Within a major version on oxbin_format_version, the artifact may carry a multi-release overlay — e.g., both a projection-cache (DRedc-shaped, today’s default) and an arrangement-section (DBSP-shaped, future) in parallel. Each runtime picks the section it understands. The pattern is JEP 238 (multi-release JAR) at the section level instead of the file level.
5. Content-addressing — Merkle-tree-of-sections
Per-section content_hash: [u8; 32] (SHA-256 over the encoded body) populates the CONTENT_HASHED field for sections where it is meaningful — events, standpoint-lattice, fork-lineage, tier-table, projection-cache (per-segment), arrangement-section (per-segment), doc-blocks. The artifact-level hash is the SHA-256 over (composition_signature ∥ section_directory_canonical_form), where composition_signature is the value RFD-0034 specifies.
Two semantically-equivalent compositions produce byte-equivalent artifacts. The deterministic-CBOR discipline (§4.2.1) plus per-section canonical-form rules (lex-sorted entries within each section’s logical structure; explicitly specified per section in oxc-protocol::core::oxbin) guarantee this.
Selective cache invalidation follows: a TBox change touches only the events section’s content hash; the lattice and tier-table hashes are unchanged. Downstream caches (the kernel’s segment cache, peer caches in deployment, Bazel/Nix-style build caches) invalidate at section granularity.
6. Validation — hybrid Layer 1 + Layer 2
Layer 1 (closed-boolean, WASM-style). Computed at load time after global-control is parsed and before any other section body is read:
tier_valid(.oxbin, runtime) := max_tier_claimed ≤ runtime.max_tier_supported
max_tier_claimed is a uint8 in global-control denoting the highest tier of any rule in the artifact. If Layer 1 fails, the runtime refuses the artifact with OE1004 TierMismatch. The check is O(1).
Layer 2 (per-section verifier, JVM-style). Run after Layer 1 passes; per-section invariants:
- Symbol resolution — every reference in events / rules / queries / mutations resolves to a symbol in
symbol-table. - Lattice acyclicity —
standpoint-latticeis a DAG. - Provenance well-formedness — every
derivation_provenanceJSONB blob is a valid PosBool(M) DNF (lex-sorted clauses; lex-sorted witnesses within each clause). - Composition-signature consistency — the runtime recomputes
composition_signaturefrom the four stored sub-hash legs (global-control.wiring_diagram_hash,section_directory["standpoint-lattice"].content_hash, the preamble’stier_ladder_version, the preamble’skernel_api_version) per the formula in RFD-0034 §2 and verifies the result matches the storedglobal-control.composition_signature. The wiring diagram source itself is not preserved in the artifact (it’s a transient build-time input);ox composeis the build-time check that the wiring is correct, and the runtime is the load-time check that the artifact’s stored sub-hashes have not been tampered with. - Tier consistency — every rule’s tier in
tier-tabledoes not exceedmax_tier_claimed. - Doc-block well-formedness — every
DocBlockper RFD-0029 has resolvable cross-links.
Layer 2 failures surface as OE1005..OE1010 (one code per invariant family).
The default lenient/strict policy is per runtime kind:
- Sandboxed runtimes (future bytecode runtime, untrusted-content) default strict — any Layer 2 failure refuses the load.
- In-process
oxc-runtimedefaults lenient — Layer 2 failures log warnings and skip the offending section. The kernel andox checkuselenient;ox buildfor distribution usesstrict.
A --validate=strict|lenient flag on ox compose and on each runtime’s load surface overrides the default.
Strict mode forces eager validation of every Layer-2 invariant, including those for sections marked LAZY in §2. The “refuse the load on any Layer-2 failure” promise must be honourable, which means a runtime in strict mode cannot defer doc-block well-formedness checks to first deserialisation: it eagerly walks every doc-block at load time, paying the one-time cost so the contract is real. §7 specifies the lazy-by-default load order; the strict-mode override flips the lazy steps synchronous for validation purposes only (the underlying sections still mmap as before; the runtime simply runs each section’s Layer-2 sub-pass before returning success). Lenient mode keeps the lazy schedule and surfaces failures as OW1011 when they’re discovered later.
7. Streaming load order
Layer-2’s per-section invariants depend on the sections being loaded — symbol resolution needs the symbol table, lattice acyclicity needs the lattice, and so on. Layer 2 is therefore not a single pass after step 3; it splits into per-section sub-passes interleaved with loading, each running immediately after its dependent section is in memory. This matches the natural reading of “per-section verifier” (JVM-style) and avoids the load-before-validate ordering problem.
1. Synchronous: file-magic + preamble.
2. Synchronous: global-control + section directory.
3. Layer-1 validation (tier check, O(1)). ← fails fast on tier mismatch
4. Synchronous: symbol-table. ← front-coded; fast
5. Synchronous: events (mmap'd; ready for prefix-scan).
Layer-2 sub-pass: symbol resolution + provenance well-formedness.
6. Synchronous: standpoint-lattice + fork-lineage + tier-table + pathology-flags.
Layer-2 sub-pass: lattice acyclicity + tier consistency + composition-signature consistency.
7. Lazy: projection-cache segments. ← mmap on first miss; per-segment CONTENT_HASHED
8. Lazy: arrangement-section (if present).
9. Lazy in lenient mode / synchronous in strict mode: doc-blocks — per-item lazy-deserialise (Idris 2 `ContextEntry` pattern). Layer-2 sub-pass: doc-block well-formedness, run on first deserialisation of each block in lenient mode, run eagerly across every block before load returns in strict mode (per the §6 strict-mode override).
10. Ignored: user-custom sections in the `200..255` range (unless `MANDATORY`-flagged).
The synchronous prefix is steps 1-6 in lenient mode; strict mode extends the synchronous prefix to include the lazy Layer-2 sub-passes (step 9 in particular). Heavy payloads (projection cache, arrangements) remain mmap’d-on-first-access regardless of mode — they don’t have Layer-2 invariants the validator needs to walk eagerly. Cold-start cost in lenient mode is dominated by steps 1-6; strict mode adds the doc-block traversal cost at the front. Empirical target: sub-25 ms cold start on the lease-story workload in lenient mode; strict-mode cost is bounded by doc-block count and verified via benchmark harness.
Layer-2 failure within a sub-pass is handled per the lenient-vs-strict policy in §6. Strict runtimes refuse the load on the first sub-pass failure (including doc-blocks per the §6 override); lenient runtimes log the failure, skip the offending section’s logical content, and continue, surfacing the failure as OW1011 when it’s encountered.
8. Tier-typed sectioning
Per-rule tier annotations live in the tier-table section: tier_table[rule_id] = tier, where tier is a uint8 enum into the tier ladder defined by RFD-0004. The maximum across all rules is global-control.max_tier_claimed.
A runtime advertises its capabilities via runtime.max_tier_supported. Layer-1 validation refuses an artifact whose max_tier_claimed > max_tier_supported. Sandboxed runtimes that cannot evaluate tier:fol content thus refuse FOL artifacts at the doorstep, before any rule body is parsed.
The tier-table section itself is MANDATORY-flagged: a runtime that does not understand the section refuses the load. This is necessary because Layer 2’s tier-consistency sub-pass (§6) verifies that every rule’s per-rule tier annotation does not exceed max_tier_claimed; without the section the sub-pass cannot run and the artifact-runtime tier contract is unverifiable. (Layer 1 itself reads only the max_tier_claimed uint8 from global-control; Layer 2 is what consumes the tier-table.)
Per-section tier filtering is reserved for a future cut. Today the granularity is artifact-wide (max_tier_claimed); per-section tier (e.g., tier:closure rules in one section, tier:fol rules in another, with the runtime loading only the closure section) is a v2 extension.
9. Standpoint-lattice serialisation
The standpoint-lattice section serialises the workspace’s standpoint DAG: a flat list of standpoint records ({id: UUID, name: String, parents: [UUID]}) plus precomputed ancestor sets ({standpoint_id: UUID, ancestors: [UUID]}), the latter cached in global-control for O(1) lookup at query time.
Every event row in the events section carries a standpoint_id: UUID column. Cross-standpoint queries traverse via WHERE standpoint_id = ANY($ancestors) against the precomputed ancestor set — no recursive CTE at query time.
Cross-package lattice composition (when ox compose merges N packages each declaring standpoints) follows RFD-0034’s rules: standpoints from different packages identify only by UUID (never by name); union the lattices; reject cycles. The composed lattice is a DAG by invariant.
This is the single most novel section in the artifact. No surveyed system persists N standpoints in one binary form with efficient cross-standpoint indexing. Strass et al. 2023’s Standpoint EL+ Soufflé prototype encodes the lattice in a 5-ary gci_nested(t, b, s, c, d) Datalog form but explicitly admits scalability issues; Argon’s pre-materialised ancestor sets are exactly the optimisation that paper invites.
10. Saturation cache + per-fact provenance combination
The events section is canonical: every axiom + every retraction + every derived event carries its derivation_provenance JSONB inline (PosBool(M) DNF, per RFD-0007 and the workspace’s settled provenance convention). The events section is the source of truth for both factual content and provenance.
The projection-cache section is derived — one Cap’n Proto segment per maintained projection (subsumption closure, classification, rule-supports, etc.). Each segment is content-keyed; on TBox or rule change the affected segments invalidate; unchanged segments persist. The section is optional (per the §2 table): an .oxbin that omits it is fully functional — the runtime saturates from canonical events on first query. Production deployments typically ship the cache; development builds and minimal-footprint distributions may omit it.
This combination is novel as a shipped artifact. RDFox and VLog persist saturation but not per-fact provenance. Provenance-aware reasoners (Bourgaux/Ozaki/Peñaloza 2023) prove polynomial-time guarantees but do not ship persisted artifacts. Argon ships both. Bourgaux 2023’s idempotent-semiring tractability theorem ratifies the polynomial cost of carrying PosBool(M) DNF alongside the cache.
11. PosBool(M) provenance — value field, lex-sorted DNF
Per the workspace’s settled conventions: PosBool(M) DNF is a value field on every event row, encoded as JSONB-shape inside CBOR. The DNF carries lex-sorted clauses; within each clause, lex-sorted witnesses; within each witness, the CBOR tag (65536 / 65537 / 65538) discriminates axiom_id / rule_id / module_id.
PosBool(M) is not a semiring weight on a Z-set algebra. It is a value field. The reason is algebraic: PosBool is a distributive lattice without additive inverses; a Z-set group requires inverses; so direct PosBool-weight lift is unsound. The settled approach (provenance updates as side-effect via the Green/Karvounarakis/Tannen 2007 homomorphism) is the production form.
Bourgaux/Ozaki/Peñaloza 2023’s main theorem: under the multiplicative-idempotent-semiring restriction (which PosBool(M) satisfies), provenance-aware ontology completion is polynomial. The same algebraic property that ruled out the DBSP weight lift gives Argon polynomial provenance-aware reasoning. One constraint, two unrelated benefits.
12. Four-valued (Kleene–Belnap) refinement — query-side, not storage-side
Argon’s refinement semantics under open-world assumption are four-valued (Kleene–Belnap: T / F / U / B). Kleene’s three-valued logic supplies the {T, F, U} fragment (Unknown for missing support); Belnap’s 1977 FOUR extends this with B (Both / Paradox) for facts supported in conflicting directions. The combined Kleene–Belnap system is four-valued; calling it “three-valued” elides the B case and would mislead implementors. No surveyed work bridges Standpoint Logic with four-valued semantics; the bridge is an Argon invention.
Argon’s bridge: storage stays two-valued; four-valuedness lives in query interpretation. The events section records each source’s claim (T or F) per row. Query semantics aggregate to {T, F, U, B} per Kleene–Belnap by interpreting the PosBool(M) DNF: a fact with only T-supports renders as T; a fact with only F-supports renders as F; a fact with both T and F-supports under conflicting standpoint perspectives renders as B; a fact with no supports under the queried standpoint renders as U.
The schema does not change. The runtime contract specifies the query-side interpretation. Empirical validation against the lease-story workload is required during implementation; if real workloads need cell-level four-valued storage the approach revisits.
13. Hot replacement — bitemporal basis-T preservation contract
The artifact-runtime contract for hot replacement is novel as a contract specification. The technical capability already exists (the bitemporal valid_time × tx_time event log + the kernel’s query-at-basis API). What this RFD adds is the explicit specification.
Contract (the runtime backend implements this):
- On artifact replacement (a
.oxbinswapped in at tx_time T₂), in-flight queries that began at basis T₁ continue to see the artifact’s state at T₁. The runtime keeps the old artifact in memory until those queries complete. - New queries initiated after T₂ see the new basis.
- Two-version invariant (BEAM lesson). At most “current” and “old” coexist in the runtime simultaneously. Loading a third version waits for queries on the old to complete (or aborts them per a runtime policy flag, with the cost surfacing as a metric).
- The runtime advertises the basis-T it serves on every response. Consumers see which basis their answer was computed against.
Hot replacement is therefore a property of the runtime’s discipline, not of the artifact’s structure. The artifact carries the bitemporal columns; the runtime preserves the basis-T invariant.
14. Runtime backend trait surface
A backend that consumes a .oxbin implements the trait. Module handles are shared via Arc<Module> so in-flight queries can keep their handle alive while the runtime atomically swaps in a new one (the two-version invariant in §13). Every method takes &self — implementors use interior mutability (ArcSwap<Module> for the current-module pointer, plus a per-Store RwLock or per-overlay Mutex for append’s write path) so concurrent queries and a hot replacement never compete for &mut self. This is the standard Rust pattern for the “old code keeps running” invariant.
#![allow(unused)]
fn main() {
pub trait OxbinRuntime: Send + Sync {
/// Validate + load an artifact. Honours Layer 1 + Layer 2.
/// Sets the runtime's "current" handle to the returned `Arc<Module>`
/// and returns it for the caller's use.
fn load(&self, artifact: &Oxbin) -> Result<Arc<Module>, LoadError>;
/// Hot-replace the loaded artifact. Honours the §13 basis-T
/// preservation contract: the runtime atomically installs the new
/// module as "current" while keeping `old` reachable through every
/// existing `Arc<Module>` clone. In-flight queries that already
/// hold the old handle complete against it; new `current()` calls
/// return the new handle. Returns the new handle.
fn replace(&self, old: Arc<Module>, new: &Oxbin) -> Result<Arc<Module>, LoadError>;
/// Cheap snapshot of the runtime's current module handle. Callers
/// initiating a new query / mutation grab a fresh `Arc<Module>`
/// here; they retain it for the duration of their work.
fn current(&self) -> Arc<Module>;
/// Query at a specified bitemporal basis. Returns rows with
/// provenance. The caller passes the `Arc<Module>` they want to
/// query against — typically the result of a recent `current()`.
fn query(&self, module: &Arc<Module>, q: Query, basis: Basis) -> QueryResult;
/// Apply a mutation as a dry-run (no persistence). Returns the
/// would-emit + would-retract delta with provenance.
fn dry_run(&self, module: &Arc<Module>, m: Mutation, basis: Basis) -> MutationDryRun;
/// Append axioms to the runtime's per-tenant overlay (not the
/// artifact's events). The artifact remains immutable; the
/// overlay is the per-tenant Store. Implementations serialise
/// concurrent writers via per-overlay locking.
fn append(&self, module: &Arc<Module>, events: &[Event]) -> Result<(), AppendError>;
/// Capabilities advertisement. Runtime reports max_tier_supported,
/// resource bounds, and feature flags.
fn capabilities(&self) -> RuntimeCapabilities;
}
}
The trait lives in oxc-protocol (or its companion crate) so that any backend — kernel, in-process oxc-runtime, sandboxed bytecode runtime, oxigraph-backed embedded engine — implements the same contract. Argon stays kernel-independent: the kernel becomes one implementation of OxbinRuntime; alternative backends are first-class.
The Send + Sync super-trait bound is required so a single runtime instance can be shared across a thread pool. The contract is that all methods are concurrency-safe; the implementor decides how (mutexes, atomic pointers, lock-free structures).
15. Resource bounds — three orthogonal axes
A runtime advertises resource bounds along three orthogonal axes (per the Wasmtime-derived two-axis pattern, with the third axis being Argon-specific):
- Memory — heap / table / instance count limits (Wasmtime
ResourceLimiter-style). - CPU — fuel-bounded (deterministic, ~3× slowdown) or epoch-bounded (non-deterministic, ~10% slowdown). Wasmtime
consume_fuel/epoch_interruptionare the precedents. - Rule-fire count — Argon-specific; bounds saturation work. A
tier:closureruntime advertises a finite rule-fire bound; atier:folartifact section declines this bound (FOL has no termination guarantee in general).
Resource limits are part of the runtime’s capability advertisement. Artifacts may declare resource requirements in global-control; the runtime refuses an artifact whose declared requirements exceed its advertised bounds.
16. Diagnostic codes introduced
| Code | Severity | Trigger |
|---|---|---|
OE1001 | Error | Artifact magic / preamble invalid. |
OE1002 | Error | Section directory references a section beyond the file size. |
OE1003 | Error | MANDATORY-flagged section with unknown type. |
OE1004 | Error | Layer 1: max_tier_claimed > max_tier_supported. |
OE1005 | Error | Layer 2: symbol resolution failed. |
OE1006 | Error | Layer 2: lattice has a cycle. |
OE1007 | Error | Layer 2: provenance JSONB malformed. |
OE1008 | Error | Layer 2: composition signature mismatch (cache invalid or filesystem corruption). |
OE1009 | Error | Layer 2: tier-table inconsistent with max_tier_claimed. |
OE1010 | Error | Layer 2: doc-block has unresolvable cross-link. |
OW1011 | Warning | Lenient validation: a Layer-2 failure was tolerated; section skipped. |
OE1012 | Error | Resource requirement declared by artifact exceeds runtime’s advertised bound. |
OE1013 | Error | Hot-replace aborted: two-version invariant would be violated and runtime policy is abort; in-flight queries on the prior artifact were terminated. |
OW1014 | Warning | Hot-replace queued: two-version invariant would be violated; replacement deferred until in-flight queries on the prior artifact complete. Normal degraded-performance path; not a failure. |
Codes register in oxc-protocol::core::codes.
17. CLI surface
ox compose— produce a workspace.oxbin. Per RFD-0034.ox run <oxbin> [--query <q>] [--mutate <m>] [--basis <T>]— execute against an artifact via the in-process runtime. Returns answers + provenance.ox inspect <oxbin>— surface the artifact’s structure: section directory, content hashes, max_tier, standpoint count, event count, projection-cache state, doc-block coverage, declared resource requirements.ox bench <oxbin> [--cold | --warm]— run the benchmark harness against an artifact. Cold-start, query latency, mutation dry-run latency.oxc instantiate <package> --wiring <wiring.json>— emit a per-package.oxccache, instantiated against the supplied wiring diagram. Defined in RFD-0034 §10. Driven byox compose; rarely invoked directly.oxc dump <oxbin-or-oxc> <section>— dump a section’s CBOR (or Cap’n Proto) decoded to JSON for compiler-internal debugging. Accepts either a workspace.oxbinor a per-package.oxccache (the section model is the same skeleton at different scopes per §1).
ox run is the user-facing surface. oxc dump is the toolsmith surface. The kernel does not invoke either; it consumes .oxbin directly via the OxbinRuntime trait.
18. Backwards compatibility
This RFD lands at oxbin_format_version = 1. Earlier versions did not exist; there is no migration burden.
Future minor bumps add sections (additive only). Future major bumps may rearrange the section model; consumers detect via oxbin_format_version and refuse-or-upgrade. Multi-release overlays carry both old + new sections in parallel during the migration window.
The tier_ladder_version is initially 1; the existing tier ladder (RFD-0004) is the v1 vocabulary. The kernel_api_version is initially aligned with the current kernel HTTP API.
Rationale
The structural skeleton is settled by literature. ELF (1995), Erlang BEAM (1986), WebAssembly (2017), HDT (2013), and Lean 4.22 multi-file .olean.* (2025) independently arrived at the same shape: file-magic + global-control + section directory with content hashes + per-section sub-headers + reservation-based extensibility. Five mature systems across thirty years and four lineages converging on the same skeleton is a strong signal. RFD-0035 inherits the skeleton.
Layered encoding is the production discipline. CBOR (RFC 8949 deterministic) is the right choice for structural sections because the encoding is well-specified, the deterministic discipline gives content-addressing for free, and the workspace already uses CBOR for axiom-event bodies. Cap’n Proto is the right choice for hot mmap’d cache because zero-copy load is what makes the cache pay off; an entirely-CBOR cache would defeat its purpose. HDT-PFC for the symbol table is the targeted choice for high-prefix-redundancy qualified-path strings.
Three-axis versioning matches ONNX’s lesson. ONNX’s IR / opset / model split is the cleanest precedent for independent evolution of artifact format, vocabulary, and consumer-interface versions. JEP 238 (multi-release JAR) is the lesson for in-major migration. Argon’s three axes (oxbin_format_version, tier_ladder_version, kernel_api_version) inherit both lessons.
Merkle-tree-of-sections content-addressing is the modern build-system discipline. Bazel CAS, Nix store, Datomic immutable segments, and RFC 8949 deterministic CBOR all push toward content-keyed substructure with selective invalidation. Per-section content hashing makes a TBox change touch only the events section; rules + lattice + tier-table caches survive. The discipline is uniform across modern build systems.
Hybrid validation matches the production-grade pattern. WASM’s closed-boolean validate + JVM’s per-class verifier are both load-time disciplines, applied at different granularities for different reasons. WASM’s check is O(1) and refuses large categories of invalid programs cheaply; JVM’s verifier handles fine-grained per-section invariants. Argon needs both: tier mismatch is O(1) closed-boolean (Layer 1); symbol resolution and lattice acyclicity are per-section (Layer 2).
Streaming load order with lazy heavy payloads is the cold-start discipline. Lean 4’s compactedRegion + Idris 2’s ContextEntry lazy-deserialise pattern + HDT’s wavelet-tree-on-load all converge on the same approach: load the structural prefix synchronously, mmap the rest, deserialise on first access. Cold-start cost is dominated by the synchronous prefix; the empirical target (sub-25 ms on lease-story) is the engineering goal.
Tier-typed sectioning is Argon invention forced by the tier ladder (RFD-0004). eBPF’s load-time-static-analysis-then-execute discipline applies; eBPF’s single-envelope type system (accept/reject) does not. Argon’s ladder needs ladder-typed loading. The sketched approach (closed-boolean Layer 1 against max_tier_claimed) is the smallest invention that delivers the discipline; per-section tier filtering is a v2 extension when granularity matters.
Standpoint-lattice serialisation is novel. No surveyed system carries N standpoints in one persistent artifact with efficient cross-standpoint indexing. Argon’s column-discriminator + lattice DAG section + precomputed ancestor sets is the smallest design that delivers the capability. Strass et al. 2023’s prototype admits scalability issues; Argon’s optimisation is the answer that paper invites.
Saturation cache + per-fact provenance combination is novel as a shipped artifact form. No surveyed system ships both. Bourgaux 2023’s polynomial-time guarantee under idempotent semirings ratifies the cost. The combination is forced by the workspace’s settled provenance + IVM model.
Four-valued bridge handled in query interpretation, not storage. No published bridge between Standpoint Logic and Kleene–Belnap exists. Argon’s choice — keep storage two-valued, render four-valuedness ({T, F, U, B}) in query semantics via PosBool DNF interpretation — is the smallest invention; it does not change the schema; it admits empirical revision if real workloads need cell-level four-valuedness.
Hot replacement as a contract, not a mechanism. BEAM’s lesson: hot module replacement works because the language is functional and processes are isolated, not because the runtime has a clever swap mechanism. Argon’s analogue is bitemporal basis-T preservation: a query at basis T₁ continues against the old artifact even if a new artifact arrives at T₂. The technical capability already exists (bitemporal valid_time × tx_time + query-at-basis); RFD-0035 adds the contract specification.
Runtime-backend trait surface decouples Argon from any single runtime. The kernel is one implementation; an in-process oxc-runtime library is another; future sandboxed bytecode and external embedded engines (e.g., oxigraph-backed) are anticipated. Argon stays kernel-independent because the kernel is a backend, not the privileged consumer.
Three-axis resource bounds extend Wasmtime’s two axes. Memory + CPU is Wasmtime’s factoring; Argon adds rule-fire-count as the saturation analogue. The tier:closure runtime offers a finite bound; the tier:fol artifact section declines it; the type system propagates correctly through the validation layer.
Consequences
- New crate
oxc-oxbin(or equivalent) implements the format. Reader, writer, validation. Includes the deterministic CBOR helpers + Cap’n Proto schema + HDT-PFC symbol-table encoder/decoder. oxc-protocol::core::oxbincarries the typed wire shapes for every section (events, lattice, tier-table, projection-cache via Cap’n Proto FFI types, doc-blocks). The codegen drift gate covers them.oxc-protocol::core::oxbin_runtimecarries theOxbinRuntimetrait and supporting types (Module, Basis, QueryResult, MutationDryRun, RuntimeCapabilities, LoadError, AppendError).- Kernel storage layer becomes a
OxbinRuntimeimplementation. Today’s per-tenant overlay becomes the Store; the kernel’s reasoner becomes the runtime; the artifact is the formalised input. - In-process
oxc-runtimelibrary becomes aOxbinRuntimeimplementation. Whatox check/ox test/ox docuse today becomes named explicitly. - Future runtime backends (sandboxed bytecode for agent-side execution, oxigraph-backed embedded engine, distributed peer/transactor split per Datomic) are first-class — implement the trait and they’re a runtime choice for
ox run. - Diagnostic codes
OE1001–OE1010,OW1011,OE1012–OE1013,OW1014register inoxc-protocol::core::codes. TheOWprefix onOW1011andOW1014is correct: per RFD-0024’s severity convention, warnings carryOW, errors carryOE. Code-uniqueness checks span both prefixes. - Empirical questions land as benchmark targets in the implementation:
- CBOR vs Cap’n Proto byte-size delta on the lease-story projection cache.
- Cold-start cost on lease-story (target: sub-25 ms for the synchronous prefix).
- Symbol-table front-coding compression ratio on lease-story qualified-path namespace.
- DRedc determinism in parallel rule-firing — does the cache content-hash stay stable across multiple
ox composeinvocations? - Standpoint-lattice ancestor-set cache size at lease-story scale.
- Provenance JSONB DNF growth under heavy rule-firing — does it dominate artifact size?
- Specific ratification questions for the implementation phase:
- CBOR tag numbers 65536-65538 for PosBool witnesses — final. Argon will submit a lightweight FCFS application to IANA before the
.oxbinformat graduates to a public-stable surface; until then deployment is internal-only and the major-version-bump migration path mitigates collision risk. - Bump-trigger policy for the three version axes.
- Cap’n Proto canonical-form rules per segment for content-addressing.
- Lenient/strict default per runtime kind.
- Specific list of
MANDATORY-flagged sections in v0.0.1 (tier-tableis mandatory; others TBD). - Specific section type assignments in
100..199. - Cyclic
pub usechain handling — confirmed in RFD-0034 as reject-by-default; recheck against real Argon use-cases.
- CBOR tag numbers 65536-65538 for PosBool witnesses — final. Argon will submit a lightweight FCFS application to IANA before the
- The benchmark harness (
ox bench) lands with the implementation. Lease-story serves as the canonical workload; UFO and the full BFO smoke test extend it as those packages stabilise.
This RFD’s scope is the artifact format and the runtime contract. The composition pipeline that produces the artifact is RFD-0034. The doc-rendering layer that emits HTML from doc-blocks is RFD-0029 (Phase 3). Future RFDs will cover specific runtime backends (bytecode runtime; embedded engine), the saturation cache invalidation semantics during multi-package re-composition, and any subsequent extensions.
Historical lineage
This RFD is largely novel as a combined artifact form. The structural shape is grounded in convergent prior art across compiled-artifact engineering and knowledge-graph persistence:
Sectioned binary skeleton: ELF (System V ABI gabi ch. 4); Erlang BEAM IFF (Armstrong, Programming Erlang ch. 8); WebAssembly Core Specification §5 (Haas et al., “Bringing the Web up to Speed with WebAssembly,” PLDI 2017); HDT (Fernández et al., “Binary RDF Representation for Publication and Exchange (HDT),” J. Web Semantics 2013); Lean 4.22 multi-file .olean.* (lean-lang.org/doc/reference/4.22.0).
Encoding choices: CBOR per RFC 8949 (Bormann & Hoffman 2020); Cap’n Proto (capnproto.org); HDT-PFC (Fernández et al. 2013 §3.1).
Versioning model: ONNX three-axis (IR / opset / model — onnx.ai/onnx/repo-docs/IR.html); JEP 238 (multi-release JAR — openjdk.org/jeps/238); BEAM chunk-rename-on-encoding-change (J. Högberg, erlang/otp#9336).
Content-addressing: Bazel CAS; Nix store; Datomic immutable segments (Hickey, Datomic Cloud Architecture docs); RFC 8949 §4.2.1 deterministic encoding (Bormann & Hoffman 2020).
Validation: WASM Core Specification §5.5 (validation as typed property); JVM class file verifier (Lindholm, Yellin, Bracha, Buckley, The Java Virtual Machine Specification ch. 4.10).
Streaming load + mmap: Smalltalk-80 image (Goldberg & Robson, Smalltalk-80: The Language and Its Implementation 1983); SBCL save-lisp-and-die; Lean 4 compactedRegion (de Moura & Ullrich, “The Lean 4 Theorem Prover and Programming Language”); Idris 2 ContextEntry (Brady, “Idris 2: Quantitative Type Theory in Practice,” ECOOP 2021).
Hot replacement contract: Erlang OTP “Code and Code Loading” guide; Armstrong, Programming Erlang (2007) ch. 8 — the “two-version invariant + voluntary migration + functional language preconditions” pattern.
Engine / Module / Store factoring: Wasmtime (Engine/Module/Store); JVM HotSpot (compilation cache / Class<T> per loader / VM state per process); Truffle / GraalVM (engine / language context / polyglot instance).
Resource bounds: Wasmtime ResourceLimiter + consume_fuel + epoch_interruption (wasmtime.dev docs examples-interrupting-wasm.html).
Argon-specific (invention surface):
- Tier-typed sectioning — eBPF discipline applies; eBPF type system does not. Argon’s invention forced by RFD-0004’s tier ladder.
- Standpoint-lattice serialisation in one binary artifact with cross-standpoint indexing — no surveyed system. Sketched approach inherits the encoding precedent from Strass, Gómez Álvarez & Rudolph, “Standpoint EL+: A Family of Logic-Based Reasoning Procedures” (KR 2023), with the precomputed ancestor sets being the optimisation that paper’s prototype admits is needed.
- Saturation cache + per-fact PosBool(M) provenance shipped together — no surveyed system. Bourgaux, Ozaki & Peñaloza, “Semiring Provenance for Lightweight Description Logics” (2023, arXiv:2310.16472) ratifies polynomial cost under PosBool(M)’s idempotency.
- Kleene–Belnap query-side four-valued interpretation — no published bridge between Standpoint Logic and four-valued semantics. Argon’s interpretation lives in query semantics, not storage. Belnap, “A Useful Four-Valued Logic,” 1977 (in Modern Uses of Multiple-Valued Logic) supplies the {T, F, U, B} truth-value set; Kleene’s strong three-valued logic supplies the {T, F, U} fragment.
- Cross-package standpoint lattice composition — RFD-0034. No surveyed system addresses cross-package lattice merging.
- Bitemporal basis-T preservation hot-replacement contract — BEAM’s hot-replacement pattern lifted to the bitemporal basis. Technical capability is from Snodgrass-style bitemporal databases + the workspace’s existing query-at-basis API; the contract specification is new.
The sum of these is unprecedented. Argon will be the first system to ship a typed knowledge container + program + composed link target + derived-state cache + provenance archive + tier-typed safety container + standpoint-lattice carrier in one binary artifact. The literature demonstrates that each ingredient is tractable; the combination is what this RFD locks.