RFD-0008 — Tests are first-class items; proof-status attributes for theorem claims
Question
Should Argon source files carry tests directly, or rely on an external test harness?
Decision
Tests are first-class language items declared as test <name> { … } blocks. Test bodies use Argon’s expression grammar (RFD-0005) with expect { … } and fixture { … } sub-blocks. The compiler elaborates tests under capture mode so fixture-internal diagnostics do not leak to the parent’s stream. The test runner reports five status categories: Pass, Fail, Unproven, Assumed, Ignored.
Proof-status attributes mark theorem claims:
#[unproven]— the test states a theorem the system cannot mechanically verify in the current tier. Reported as a distinctUnprovenstatus; carries anunproven_verifiedbit when the system later succeeds.#[assumed]— the test asserts a postulate; the body is injected as an axiom in the test’s scope. Reported asAssumedstatus.
Both attributes admit @[…] AttrList and @<ident> / #<ident> Decorator surface forms.
Rationale
Tests live with the code. Splitting language-level invariants from a separate test harness would have allowed the two surfaces to diverge. Putting tests as first-class items means they’re checked by the same compiler that compiles the rules they test, parsed by the same parser, type-checked against the same TBox.
Capture mode for fixtures. A fixture block is a test-local mini-module. Diagnostics inside the fixture are scoped to the fixture; they do not propagate to the parent test’s stream. This lets a test verify “this code raises diagnostic OE0211” without polluting unrelated tests’ output.
Five status categories, not three. Pass / Fail covers mechanical execution. Unproven / Assumed exposes the theorem-marking discipline: there are claims we believe but cannot verify, and there are postulates we assert. Conflating any of them into Pass or Fail erases a real distinction. Ignored is the explicit-skip path.
Why proof-status attributes replace #[expect_sorry]. The original #[expect_sorry] attribute conflated two concepts: “we know this can’t be proven” (unproven) and “we’re treating this as an axiom for the test” (assumed). Splitting them makes the runner’s report unambiguous and the test author’s intent clearer.
Consequences
testis a reserved item kind with its own AST node (Item::Test).expect { diagnostic <code> [severity:<sev>] [at <marker>] }declarative matching with//~ <label>source markers and span-overlap semantics.fixture { … }blocks elaborated under capture mode.[pub] frame Name { … }declaration form composes test contexts withusing F[, F2…]. Conflict surfaces asOE0214–OE0216,OE0218.- Runner CLI:
ox test --package <prefix>qname-prefix filter. - Test attributes lower to
CoreTest.proof_status. The runner’s report distinguishes the five categories.