Collections
Most real models eventually need to talk about many things at once. A building has units; a unit has tenants; a lease has parents on the title; a tax filing has dependents. Argon’s collection surface — sets, lists, maps, optionals, and ranges — gives you the shapes you reach for, the operations that go with them, and an expression-level surface (method calls, indexing, slicing, comprehensions, membership tests) that reads like the languages you already know.
This chapter sticks to the collection substrate that ships with the language. Everything in it lives under std::collection (with Range under std::math); none of it is a user-declarable concept, and none of it comes from a foundational-ontology package. The next section explains why that matters.
Why collections live in the substrate
Argon does not admit user-declared parametric concepts. A modeler cannot write pub kind Container[T] and have the type-checker treat Container[Person] as a distinct type from Container[Organization]. That position is deliberate: parametric concepts would commit the language to a particular reading of how foundational ontologies handle type families, which would couple Argon to a specific foundation. Argon stays foundation-neutral on that question; user code consumes a small closed set of parametric type constructors, and the substrate ships them.
The collection surface is therefore a fixed inventory:
| Constructor | Module path | Reading |
|---|---|---|
| Set[T] | std::collection::Set | Unordered, distinct, no positional access. |
| List[T] | std::collection::List | Ordered, allows duplicates, indexable. |
| Map[K, V] | std::collection::Map | Key-value; K must be orderable. |
| Optional[T] | std::collection::Optional | Zero or one value (T? is sugar). |
| Range[T] | std::math::Range | An ordered interval over an orderable primitive. |
Range lives under std::math, not std::collection, because it is parametric over any orderable primitive (a numeric range, a date range, a money range) rather than collection-shaped. The four under std::collection carry an operation surface that the language treats uniformly; Range carries a small operation surface of its own.
The substrate cannot ship a new constructor without a compiler change. That is the cost of foundation-neutrality on this point: the door to user-declared parametric concepts stays shut, and the closed inventory is the consolation. The inventory covers the everyday modeling cases.
The five type constructors
A collection-typed field, parameter, or return looks like any other type position:
use std::math::{Nat, Text, Money, Date}
use std::collection::{Set, List, Map, Optional}
use std::math::Range
pub kind Building {
id: Text,
units: List[Unit],
tenants: Set[Person],
primary_contact: Optional[Person],
rent_band: Optional[Range[Money]],
}
pub kind Unit {
number: Text,
occupants: Map[Date, Person],
}
The brackets Set[T] are mandatory. Argon’s expression grammar reserves < and > for comparison; angle-bracket generic syntax would fight with a < b and force lookahead. Brackets make the grammar unambiguous.
T? is sugar for Optional[T]. The two spellings lower to the same type at elaboration, and either is well-typed:
pub kind Lease {
cosigner: Optional[Person],
promo_code: Text?,
}
T? is the idiomatic surface for the common 0-or-1 case. Reach for the unsugared Optional[T] when you want the optionality to read prominently in the field shape — typically when the inner type itself is a collection (Optional[List[Person]] reads more clearly than List[Person]?).
A Map[K, V] requires K to be orderable. Concept-typed keys are admitted because every kernel id orders; primitive keys order naturally. The map is built on the same deterministic ordering the kernel uses elsewhere — never insertion order.
A Range[T] is parametric over any orderable primitive. The most common instances are Range[Nat], Range[Money], and Range[Date].
For Agents: every collection type constructor uses brackets, not angle brackets. Writing Set<T> is a parse error today. The token < always means “less than” in this grammar.
Field cardinality vs List[T]
Concept declarations have used multi-valued field syntax since Chapter 2.3:
[Tenant] // any number, including zero
[Tenant; >= 1] // at least one
[Modifier; <= 3] // at most three
[Field; == 4] // exactly four
That form is for relation-shaped fields: the cardinality bound is a structural constraint on the underlying binary relation, not a collection object. It lowers to Set[T] semantics by default. The cardinality clause survives elaboration as a constraint over the relation’s count.
List[T], Set[T], Map[K, V], and Optional[T] are collection-typed fields: the field’s value is a single collection object that carries its members. Cardinality on these is expressed by predicates over .size() in a where clause, not by an inline bound.
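The difference between the two forms can be modeled outside Argon. The following Python sketch is not Argon code, and the names satisfies_bound and size_at_least are hypothetical. It treats a relation-shaped field as a set of members checked against a count bound, and a collection-typed field as a value whose size is tested by an ordinary predicate.

```python
import operator

# Hypothetical model: the comparison operators a [T; <op> <n>] bound may use.
BOUND_OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq}


def satisfies_bound(members: frozenset, op: str, n: int) -> bool:
    """Check a relation's member count against a [T; <op> <n>] style bound."""
    return BOUND_OPS[op](len(members), n)


def size_at_least(xs: list, n: int) -> bool:
    """Model a where-clause predicate such as xs.size() >= n."""
    return len(xs) >= n


tenants = frozenset({"ann", "bo"})      # relation-shaped: a set of members
modifiers = ["a", "b", "c", "d"]        # collection-typed: one list value

tenants_ok = satisfies_bound(tenants, ">=", 1)
too_many_modifiers = not satisfies_bound(frozenset(modifiers), "<=", 3)
```

In this model the bound lives next to the relation, while the list carries no bound of its own and is only constrained by whatever predicate is written over its size.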
When do you reach for which?
- The field is conceptually a relation between the owning concept and others (a building’s tenants; an organisation’s employees): write [T; <bound>]. Order does not matter; duplicates are not meaningful.
- The field’s value is conceptually a collection object the modeler reaches into with methods, comprehensions, or indices: write Set[T] / List[T] / Map[K, V] / Optional[T].
- The field is ordered: write List[T], or (if you want to keep the cardinality-bound surface) [T; <bound>, ordered].
The two forms share storage shape — both lower to a collection value at the data layer — but the expression-level operations differ. Method calls (xs.size(), xs.append(x)) are available on the collection-typed forms; relation-shaped fields participate in rule bodies through the relation’s atoms.
One special case carries a quickfix hint:
pub kind Lease {
cosigner: [Person; <= 1], // OW2402 fires here
}
[T; <= 1] is a singleton-bounded set. The semantics are identical to Optional[T], but the reading is different: a <= 1 cardinality reads like “at most one tenant”, which is fine; Optional[T] (T?) reads like “a value that may or may not be present”, which is what the surface usually means. The compiler emits OW2402 suggesting the rewrite. Take the suggestion when the intent is “value that may be absent”; keep [T; <= 1] when the intent is “this relation, capped at one”.
Method-call surface
Every collection operation is invocable by method-call syntax against a typed receiver:
pub compute count_units(b: Building) -> Nat = b.units.size()
pub compute knows_tenant(b: Building, p: Person) -> Bool = b.tenants.contains(p)
The xs.m(a, b) form desugars at elaboration time to <TypeOf(xs)>::m(xs, a, b) — a qualified call into the receiver’s submodule. There are no traits; there is no runtime method-resolution table; the receiver’s type at the call site determines the submodule, and the submodule lookup succeeds or fails at elaboration.
b.units.size()
// elaborates to
std::collection::List::size(b.units)
The desugar is mechanical and total. Unknown method names produce OE2402 UnknownCollectionMethod with a “did you mean” hint listing the valid methods on the receiver’s submodule. Wrong-typed receivers — calling .size() on something that isn’t a collection — produce OE0101 UnresolvedIdentifier at the method position.
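As an illustration of the receiver-type dispatch described above, here is a small Python model. It is not how the Argon elaborator is implemented; the SUBMODULES table, the call_method helper, and the two exception classes are hypothetical stand-ins. One notable difference: Python resolves the lookup when the call runs, whereas Argon resolves it at elaboration.

```python
from typing import Any, Callable

# Hypothetical model: one "submodule" per receiver type, holding plain functions.
SUBMODULES: dict[type, dict[str, Callable[..., Any]]] = {
    list: {
        "size": lambda xs: len(xs),
        "contains": lambda xs, x: x in xs,
        "append": lambda xs, x: xs + [x],
    },
    frozenset: {
        "size": lambda s: len(s),
        "contains": lambda s, x: x in s,
    },
}


class UnknownCollectionMethod(Exception):
    """Stands in for OE2402."""


class UnresolvedIdentifier(Exception):
    """Stands in for OE0101."""


def call_method(receiver: Any, name: str, *args: Any) -> Any:
    """Model xs.m(a, b) as <TypeOf(xs)>::m(xs, a, b)."""
    submodule = SUBMODULES.get(type(receiver))
    if submodule is None:
        # The receiver is not a collection, so there is no submodule to search.
        raise UnresolvedIdentifier(f"{name} called on a non-collection receiver")
    if name not in submodule:
        valid = ", ".join(sorted(submodule))
        raise UnknownCollectionMethod(f"unknown method {name}; valid methods: {valid}")
    return submodule[name](receiver, *args)


units = ["1A", "1B", "2A"]
unit_count = call_method(units, "size")
grown = call_method(units, "append", "2B")
```

The receiver's type alone selects the table, which mirrors the point that no trait or method-resolution machinery is involved.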
Chained calls compose left-to-right:
pub compute active_unit_count(b: Building) -> Nat =
b.units.filter(unit_is_active).size()
The chain is read as (b.units).filter(unit_is_active).size() and elaborates to two qualified calls. Each intermediate value carries its own type; the chain is well-formed exactly when each step’s result type matches the receiver type the next step expects.
Higher-order arguments accept two forms:
- Compute references: the bare name of a pub compute in an argument position passes that compute as a callable value. The elaborator validates the compute’s signature against the operation’s expected closure shape; mismatches fire OE2407.
- Comprehensions: inline transformations that don’t need a named function. See the comprehensions section below.
First-class lambdas (|x| x + 1) and explicit function-type syntax (T -> U) are deferred. The two forms above cover the common cases without committing the language to a function-type vocabulary.
Comprehensions
A comprehension produces a List[U] (or, when the source is a Set, a Set[U]) by walking a source collection, optionally filtering, and projecting each surviving element:
pub compute active_unit_numbers(b: Building) -> List[Text] =
[u.number for u in b.units where unit_is_active(u)]
The shape is [<projection> for <binder> in <source> where <predicate>]. The where clause is optional. The binder is a fresh name local to the comprehension body; the projection and the predicate both see it.
Comprehensions reuse the same binding subgrammar that aggregates use in rule bodies — sum(r for s in path where pred). The semantics are the same: walk the source, retain elements satisfying the predicate, project the surviving expression.
Multiple binders are not yet admitted in the v1 surface — one source, one binder, optional where. Multi-source comprehensions, when they land, will read as [expr for x in xs, y in ys where pred] and desugar to nested filter-maps.
A comprehension desugars to a filter-map chain:
[u.number for u in b.units where unit_is_active(u)]
// elaborates to
b.units.filter(unit_is_active).map(<closure projecting u.number>)
The projection’s closure has the same elaboration discipline as a named compute: the binder is in scope, and the body must type-check against the operation’s expected closure shape.
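The equivalence between the comprehension and the filter-map chain can be checked in any language with both forms. The Python sketch below is an analogy, not Argon; the Unit dataclass and its active flag are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Unit:
    number: str
    active: bool  # hypothetical field standing in for whatever unit_is_active inspects


def unit_is_active(u: Unit) -> bool:
    return u.active


units = [Unit("1A", True), Unit("1B", False), Unit("2A", True)]

# Comprehension form: project u.number for each unit that passes the predicate.
via_comprehension = [u.number for u in units if unit_is_active(u)]

# Filter-map chain: the same walk written as two explicit steps.
via_chain = list(map(lambda u: u.number, filter(unit_is_active, units)))
```

Both forms visit the source once, in order, and keep the same elements, which is why the choice between them is a matter of readability.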
Shadow-with-warning. If the comprehension’s binder name collides with an in-scope let binding or parameter, the comprehension shadows the outer binding within its body and the compiler emits OW2403 ComprehensionBinderShadowsOuter. The warning matches what Datalog rule-body conventions already do — fresh binders per rule body — and keeps modelers from accidentally believing the outer name is in scope. Rename the binder to silence the warning.
pub compute weird(units: List[Unit], unit: Unit) -> List[Text] =
[unit.number for unit in units] // OW2403 — `unit` shadowed
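Python comprehensions scope their binder in a comparable way, which makes the hazard easy to reproduce. The sketch below is an analogy only: Python shadows silently, whereas the text above describes Argon emitting a warning.

```python
def weird(units: list[str], unit: str) -> tuple[list[str], str]:
    # Inside the brackets, `unit` is the comprehension binder, not the parameter.
    numbers = [unit.upper() for unit in units]
    # After the comprehension, `unit` refers to the parameter again.
    return numbers, unit


numbers, outer = weird(["1a", "1b"], "penthouse")
```

The parameter never reaches the projection, so a reader who expected unit to mean the outer value would be wrong; renaming the binder removes the ambiguity.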
Indexing, slicing, membership
Three postfix forms extend the method-call surface:
Indexing. xs[i] desugars to <TypeOf(xs)>::at(xs, i) returning Optional[T]. The index type must be Nat for a List; for a Map, the index must match the declared key type and the operation desugars to Map::get. Set[T] rejects indexing with OE2406 — set elements have no positional identity, so s[0] has no meaning.
pub compute first_unit(b: Building) -> Optional[Unit] = b.units[0]
pub compute tenant_at(u: Unit, d: Date) -> Optional[Person] =
u.occupants[d]
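A Python model of the Optional-returning lookups may help fix the idea. It is a sketch, not Argon: the helper names list_at and map_get are hypothetical, Python's None stands in for the absent case, and the model assumes that an out-of-range index or a missing key is what produces that absent case.

```python
from typing import Optional, TypeVar

T = TypeVar("T")
K = TypeVar("K")
V = TypeVar("V")


def list_at(xs: list[T], i: int) -> Optional[T]:
    """Model List::at. Assumes an out-of-range index yields the absent case."""
    if i < 0:
        raise ValueError("index must be a natural number")
    return xs[i] if i < len(xs) else None


def map_get(m: dict[K, V], k: K) -> Optional[V]:
    """Model Map::get. Assumes a missing key yields the absent case."""
    return m.get(k)


units = ["1A", "1B"]
occupants = {"2024-01-01": "ann"}

first = list_at(units, 0)
missing = list_at(units, 5)
tenant = map_get(occupants, "2024-01-01")
```

Because every lookup returns an optional, the caller decides what absence means, typically with unwrap_or.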
Slicing. xs[i..j] desugars to <TypeOf(xs)>::slice(xs, Range::new(i, j)), returning List[T]. Half-open i..j, open-ended i.. and ..j are admitted. Slice bounds are checked at elaboration time when both bounds are literals (xs[5..2] fires OE2405); dynamic bounds defer the check to runtime.
pub compute first_three(b: Building) -> List[Unit] = b.units[0..3]
pub compute tail(b: Building) -> List[Unit] = b.units[3..]
Range literals. i..j constructs a half-open Range[T]; i..=j constructs an inclusive range. The range never materializes its element sequence in the v1 surface; Range::contains(r, x) answers the membership question without iterating.
pub compute rent_in_band(l: Lease, lo: Money, hi: Money) -> Bool =
Range::new(lo, hi).contains(l.monthly_rent)
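The half-open and inclusive forms differ only at the upper bound, and membership needs nothing more than two comparisons. The Python sketch below models that; the Range class and its inclusive flag are hypothetical and use integers in place of Argon's orderable primitives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Range:
    lo: int
    hi: int
    inclusive: bool = False  # False models lo..hi, True models lo..=hi

    def contains(self, x: int) -> bool:
        """Answer membership by comparison alone; no elements are materialized."""
        if self.inclusive:
            return self.lo <= x <= self.hi
        return self.lo <= x < self.hi


half_open = Range(2500, 5000)
closed = Range(2500, 5000, inclusive=True)

at_upper_half_open = half_open.contains(5000)
at_upper_closed = closed.contains(5000)
```

The same comparison works for any orderable element type, which is consistent with Range being parametric rather than collection-shaped.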
Membership. x in xs desugars to <TypeOf(xs)>::contains(xs, x). x not in xs desugars to the negation. The form is admitted at expression position and at rule-atom position:
pub derive premium_unit(U) :-
Unit(U),
monthly_rent(U, R),
R in Range::new(2500, 5000)
pub compute is_listed(b: Building, p: Person) -> Bool = p in b.tenants
Operation catalog
The v1 surface ships a fixed set of operations per type constructor. The table records the operation’s signature, its return shape, its decidability tier (the per-context admission cell uses this — see the next section), and a one-line example.
The signature notation reads op(receiver, args...) -> Return. Where the return depends on the receiver’s element type or on a closure argument’s return type, that is called out.
Set[T]
| Op | Signature | Return | Tier | Example |
|---|---|---|---|---|
| of | variadic | Set[T] | closure | Set::of(1, 2, 3) |
| empty | nullary | Set[T] | closure | Set::empty() |
| size | (Set[T]) -> Nat | Nat | closure | s.size() |
| contains | (Set[T], T) -> Bool | Bool | closure | s.contains(x) |
| union | (Set[T], Set[T]) -> Set[T] | receiver | closure | s.union(t) |
| filter | (Set[T], pred) -> Set[T] | receiver | expressive | s.filter(p) |
List[T]
| Op | Signature | Return | Tier | Example |
|---|---|---|---|---|
| of | variadic | List[T] | closure | List::of(1, 2, 3) |
| size | (List[T]) -> Nat | Nat | closure | xs.size() |
| contains | (List[T], T) -> Bool | Bool | closure | xs.contains(x) |
| at | (List[T], Nat) -> Optional[T] | Optional[T] | closure | xs.at(0) |
| append | (List[T], T) -> List[T] | receiver | closure | xs.append(x) |
| slice | (List[T], Range[Nat]) -> List[T] | List[T] | closure | xs.slice(0..3) |
| map | (List[T], f: T -> U) -> List[U] | derived | expressive | xs.map(f) |
| filter | (List[T], pred) -> List[T] | receiver | expressive | xs.filter(p) |
| fold | (List[T], U, f: (U, T) -> U) -> U | accumulator | recursive | xs.fold(0, f) |
Map[K, V]
| Op | Signature | Return | Tier | Example |
|---|---|---|---|---|
| of | variadic key-value pairs | Map[K, V] | closure | Map::of((k, v), ...) |
| get | (Map[K, V], K) -> Optional[V] | Optional[V] | closure | m.get(k) |
Optional[T]
| Op | Signature | Return | Tier | Example |
|---|---|---|---|---|
| Some | (T) -> Optional[T] | Optional[T] | closure | Some(x) |
| None | () -> Optional[T] | Optional[T] | closure | None() |
| is_some | (Optional[T]) -> Bool | Bool | closure | o.is_some() |
| is_none | (Optional[T]) -> Bool | Bool | closure | o.is_none() |
| unwrap_or | (Optional[T], T) -> T | element | closure | o.unwrap_or(default) |
| map | (Optional[T], f: T -> U) -> Optional[U] | derived | expressive | o.map(f) |
Range[T]
| Op | Signature | Return | Tier | Example |
|---|---|---|---|---|
| new | (T, T) -> Range[T] | Range[T] | closure | Range::new(0, 10) |
| contains | (Range[T], T) -> Bool | Bool | closure | r.contains(x) |
The v1 surface is intentionally narrow. Set algebra (intersection, difference, symmetric difference), additional list ops (prepend, insert_at, sort, take_while, …), additional optional ops (flat_map, and, or), and Range::collect (which would materialize a range’s elements) are tracked as follow-up work. Until they land, modelers compose the v1 ops via comprehensions and fold.
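To make "compose the v1 ops" concrete, the Python sketch below models two of the deferred set operations as a filter with a membership predicate, and a list total as a fold. It is a model, not Argon: the helper names are hypothetical, and Python lambdas appear only because the model needs them. In Argon the same filters would presumably be written as comprehensions such as [x for x in s where x in t], since first-class lambdas are deferred.

```python
from functools import reduce
from typing import Callable


def set_filter(s: frozenset, pred: Callable) -> frozenset:
    """Model Set::filter: keep the members that satisfy pred."""
    return frozenset(x for x in s if pred(x))


def intersection(s: frozenset, t: frozenset) -> frozenset:
    # Models the comprehension [x for x in s where x in t].
    return set_filter(s, lambda x: x in t)


def difference(s: frozenset, t: frozenset) -> frozenset:
    # Models the comprehension [x for x in s where x not in t].
    return set_filter(s, lambda x: x not in t)


def list_fold(xs: list, init, f: Callable):
    """Model List::fold: thread an accumulator through xs from left to right."""
    return reduce(f, xs, init)


def total(xs: list) -> int:
    return list_fold(xs, 0, lambda acc, x: acc + x)


current = frozenset({"ann", "bo", "cy"})
former = frozenset({"bo", "dee"})

both = intersection(current, former)
only_current = difference(current, former)
rent_total = total([1200, 1500, 900])
```

Both set operations rely on filter, which is expressive-tier, so under the tier rules this composition belongs in a pub compute body rather than a rule body.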
Tier dispatch matrix
Every operation carries a decidability tier (the Tier column of each table above). The tier interacts with the surrounding evaluation context: different bodies admit different sets of operations.
| Context | closure-tier | expressive-tier | recursive-tier |
|---|---|---|---|
| pub compute body | yes | yes | yes |
| pub mutation do { } | yes | yes | yes |
| pub derive body | yes | reject (OE2408) | reject (OE2408) |
| query body | yes | reject (OE2408) | reject (OE2408) |
| Refinement where { } | reject (OE2408) | reject | reject |
| test block | yes | yes | yes |
The reading is:
- pub compute / pub mutation / test admit the full operation surface. These contexts are where transformations live; they are tier-tolerant by design.
- pub derive / query admit only closure-tier operations. Rule bodies in these contexts feed the kernel’s saturation; higher-tier operations risk pushing the rule above the closure ceiling that derivation depends on for tractable evaluation.
- Refinement where { } rejects every collection operation. Refinement bodies admit predicates over the metatype calculus only (see Chapter 2.6), and collection ops are out of fragment.
The closure-tier operations (of, empty, size, contains, union, at, append, slice, get, Some, None, is_some, is_none, unwrap_or, Range::new, Range::contains) are the ones a rule body or query body can call freely. The expressive- and recursive-tier ops (map, filter, fold) must be lifted into a pub compute body when a rule’s logic needs them.
A rule-body violation produces OE2408 with a hint suggesting the move:
error[OE2408]: higher-tier collection op in restricted context
--> rules.ar:14:18
|
14 | adult_count = b.tenants.filter(is_adult).size(),
| ^^^^^^^ filter is `tier:expressive`
|
= note: `pub derive` bodies admit only `tier:closure` ops
= help: lift the transformation into a `pub compute` and call it
from the rule body
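The matrix reduces to a lookup: each operation has a tier, each context admits a set of tiers. The Python sketch below encodes the tables from this chapter in that form. It is a model of the rule, not the compiler's implementation; the dictionary names and the check function are hypothetical.

```python
from typing import Optional

# Tier per operation name, transcribed from the catalog tables.
TIER = {
    "of": "closure", "empty": "closure", "size": "closure",
    "contains": "closure", "union": "closure", "at": "closure",
    "append": "closure", "slice": "closure", "get": "closure",
    "Some": "closure", "None": "closure", "is_some": "closure",
    "is_none": "closure", "unwrap_or": "closure", "new": "closure",
    "map": "expressive", "filter": "expressive",
    "fold": "recursive",
}

ALL_TIERS = frozenset({"closure", "expressive", "recursive"})

# Tiers admitted per evaluation context, transcribed from the dispatch matrix.
ADMITTED = {
    "pub compute": ALL_TIERS,
    "pub mutation": ALL_TIERS,
    "pub derive": frozenset({"closure"}),
    "query": frozenset({"closure"}),
    "refinement where": frozenset(),
    "test": ALL_TIERS,
}


def check(context: str, op: str) -> Optional[str]:
    """Return None when the op is admitted in the context, else the code OE2408."""
    return None if TIER[op] in ADMITTED[context] else "OE2408"


derive_filter = check("pub derive", "filter")
compute_fold = check("pub compute", "fold")
```

Here derive_filter carries the rejection code and compute_fold is None, matching the diagnostic above and the suggested fix of lifting the call into a pub compute.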
Functional semantics and the rebuild-and-assign idiom
Every collection operation is pure: it returns a new collection and leaves the receiver unchanged. xs.append(x) does not modify xs; it returns a new List[T] whose elements are xs’s followed by x.
In a pub mutation do { } body, the idiomatic way to “modify” a collection-valued field is to rebuild it and assign:
pub mutation add_parent(l: Lease, p: Person) {
do {
l.parents = l.parents.append(p);
}
return l;
}
The right-hand side computes a new List[Person]; the assignment rebinds the field. The kernel records the change as a new GroundAssertion in the append-only event log — every collection-valued property change is a single event, not a partial-update event. The discipline matches the storage model byte for byte.
In-place mutators (l.parents.insert!(p), l.parents.push!(p)) are deferred. The rebuild-and-assign form is the only mutation idiom in v1.
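The rebuild-and-assign idiom can be modeled with immutable tuples. The Python sketch below is an illustration, not Argon: the Lease class, the append helper, and especially the event_log list are hypothetical, with the log standing in very loosely for the kernel's append-only record.

```python
from dataclasses import dataclass


def append(xs: tuple, x) -> tuple:
    """Model List::append: return a new sequence, leave xs untouched."""
    return xs + (x,)


@dataclass
class Lease:
    parents: tuple


# Hypothetical stand-in for the append-only event log.
event_log: list = []


def add_parent(lease: Lease, p: str) -> Lease:
    # Rebuild the whole collection, then rebind the field in one assignment.
    lease.parents = append(lease.parents, p)
    # One event per field change, carrying the complete new value.
    event_log.append(("parents", lease.parents))
    return lease


original = ("ann",)
lease = add_parent(Lease(parents=original), "bo")
```

The tuple bound to original is unchanged after the call, and the log holds one entry with the full new value rather than a partial update.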
For optional fields, the same pattern applies:
pub mutation set_primary(b: Building, p: Person) {
do {
b.primary_contact = Some(p);
}
return b;
}
pub mutation clear_primary(b: Building) {
do {
b.primary_contact = None();
}
return b;
}
Optional under the open-world assumption
Optional[T] interacts with Argon’s open-world semantics in a way that matters for two ops: is_some and unwrap_or.
is_some(o) is classical on presence. If o is Some(_) it returns true; if o is None() it returns false. The truth value of the inner element does not enter the determination: is_some(Some(Undefined)) returns true because the option has a value, even if that value’s truth status is itself Undefined.
unwrap_or(o, default) returns the inner value when o is Some(_), returning default only when o is None(). Critically, unwrap_or(Some(Undefined), default) returns Undefined, not default. The fallback applies to absence, not to the inner element’s truth status.
The reason: K3 (the three-valued logic the language uses for refinement under OWA — see Chapter 2.6) distinguishes “no value here” (a structural absence) from “value present but its truth is unknown” (an epistemic gap on the value itself). unwrap_or collapsing both cases would lose the distinction; the truth-value semantics rely on the distinction surviving.
When the intent is “if the inner value is undefined, substitute a default,” chain a separate operation that operates on the inner value:
pub compute rent_or_zero(l: Lease) -> Money =
l.monthly_rent.unwrap_or(0)
// If `l.monthly_rent` is `Some(Undefined)`, the result is Undefined.
// To force a fallback on undefined values, use a refinement
// or `match` over the truth state of the inner value.
The Argon truth-value substrate (RFD-0037) covers the bilattice mechanics. Optional[T]’s semantics are downstream of that substrate: presence is classical; truth-of-the-inner-value follows whichever lattice context the surrounding standpoint declares.
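The distinction between structural absence and an undefined inner value can be modeled with two separate sentinels. The Python sketch below is a model, not Argon: UNDEFINED, NOTHING, and the Some class are hypothetical stand-ins, and unwrap_or_if_undefined is an invented helper showing what a separate operation on the inner value could look like.

```python
from dataclasses import dataclass
from typing import Any


class _Undefined:
    """Stands in for a value whose truth status is unknown."""

    def __repr__(self) -> str:
        return "Undefined"


class _Nothing:
    """Stands in for None(): no value is present at all."""

    def __repr__(self) -> str:
        return "None()"


UNDEFINED = _Undefined()
NOTHING = _Nothing()


@dataclass(frozen=True)
class Some:
    value: Any


def is_some(o: Any) -> bool:
    # Presence only; the inner value is never inspected.
    return isinstance(o, Some)


def unwrap_or(o: Any, default: Any) -> Any:
    # The fallback applies to absence, not to an undefined inner value.
    return o.value if isinstance(o, Some) else default


def unwrap_or_if_undefined(o: Any, default: Any) -> Any:
    # Hypothetical second step: also substitute when the inner value is undefined.
    inner = unwrap_or(o, default)
    return default if inner is UNDEFINED else inner


present_but_unknown = Some(UNDEFINED)
kept_undefined = unwrap_or(present_but_unknown, 0)
forced_default = unwrap_or_if_undefined(present_but_unknown, 0)
```

Keeping the two sentinels apart is what lets unwrap_or preserve the epistemic gap while a later, explicit step decides whether to collapse it.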
Worked example
A small but real exercise. A residential building owns units; each unit has occupants over time; each occupant has a contact; the building reports a roster.
use std::math::{Nat, Text, Money, Date}
use std::collection::{Set, List, Map, Optional}
use std::math::Range
pub kind Person {
id: Text,
full_name: Text,
contact_email: Optional[Text],
}
pub kind Unit {
number: Text,
occupants: List[Person],
rent: Money,
}
pub kind Building {
id: Text,
units: List[Unit],
rent_band: Optional[Range[Money]],
}
// Closure-tier predicates — usable in rule bodies, query bodies,
// compute bodies, mutation bodies.
pub compute has_occupants(u: Unit) -> Bool = u.occupants.size() > 0
pub compute rent_in_band(u: Unit, band: Range[Money]) -> Bool =
band.contains(u.rent)
// Expressive-tier transformations — compute / mutation / test only.
pub compute occupied_units(b: Building) -> List[Unit] =
b.units.filter(has_occupants)
pub compute unit_numbers(b: Building) -> List[Text] =
[u.number for u in b.units]
pub compute roster(b: Building) -> List[Text] =
[p.full_name for u in b.units where has_occupants(u)
for p in u.occupants]
// Multi-source comprehensions are deferred; today this would
// chain two computes or compose two filter-map calls. Shown
// here as the form the v2 surface will admit.
// Optional unwrap with a fallback.
pub compute contact_or_placeholder(p: Person) -> Text =
p.contact_email.unwrap_or("(no contact)")
// Membership at rule-atom position — closure-tier, admitted in
// derive bodies.
pub derive premium_unit(U) :-
Unit(U),
rent(U, R),
R in Range::new(2500, 5000)
// Mutation — rebuild and assign.
pub mutation add_occupant(u: Unit, p: Person) {
do {
u.occupants = u.occupants.append(p);
}
return u;
}
// Setting an optional field.
pub mutation set_rent_band(b: Building, lo: Money, hi: Money) {
do {
b.rent_band = Some(Range::new(lo, hi));
}
return b;
}
Running ox check against this module exercises most of the surface the chapter introduced: bracket type construction, method-call dispatch, comprehensions, range construction and membership, the in operator at rule-atom position, and the rebuild-and-assign mutation idiom. The ? sugar is deliberately absent: every optional field here uses the unsugared Optional[T] so the optionality reads loud. Indexing and slicing do not appear in this module.
The compute roster references multi-source comprehensions; the v1 elaborator does not yet admit them (OE2402 fires today with a hint pointing at the tracking issue). The single-source forms are shipping; the multi-source form lands with the v2 surface.
For Agents
Four idioms cover the bulk of collection work:
- Comprehension over method chain. Prefer [u.number for u in b.units where unit_is_active(u)] over b.units.filter(unit_is_active).map(unit_to_number) when the projection is a small expression. The comprehension reads closer to the intent. Reach for the chain when each step has a name worth preserving.
- Optional unwrap with fallback. o.unwrap_or(default) covers presence-based defaulting. Remember that the fallback applies to absence only; if the inner value is Undefined, the result is Undefined. When you want “either way, substitute a default”, chain a separate operation that operates on the inner value’s truth state.
- Rebuild-and-assign for mutation. do { l.parents = l.parents.append(p) } is the only way to “modify” a collection field in v1. The functional semantics are deliberate; the kernel’s event log depends on every collection-valued property change being a single GroundAssertion.
- [T; <= 1] should become T?. The compiler emits OW2402 on the singleton-bounded form. Take the suggestion when the field’s intent is “value that may be absent”; the Optional form unlocks the full optional op surface (is_some, unwrap_or, map). Keep [T; <= 1] only when the field genuinely reads as “this relation, capped at one occurrence”.
A fifth, less frequent: the tier matrix is the gate. A map / filter / fold call in a derive rule body fires OE2408; lift the transformation into a pub compute and call the compute from the rule. The tier ceiling is what keeps rule-body evaluation tractable.
Summary
- Set[T], List[T], Map[K, V], Optional[T] live in std::collection. Range[T] lives in std::math. Brackets are mandatory; T? is sugar for Optional[T].
- Field cardinality [T; bound] is relation-shaped; List[T] / Set[T] are collection-typed; [T; <= 1] triggers a rewrite hint to T?.
- Method-call syntax desugars to a qualified UFCS call at elaboration time. No traits; no runtime dispatch.
- Comprehensions reuse the aggregate-binding grammar; binders shadow with a warning.
- xs[i] / xs[i..j] / x in xs desugar through the catalog. Sets reject indexing.
- The operation catalog is the closed v1 surface. Each op carries a tier; the tier-dispatch matrix decides admission per context.
- Mutation is rebuild-and-assign. Functional semantics align with the kernel’s append-only event log.
- Optional under OWA: is_some is classical on presence; unwrap_or applies only to absence and preserves inner-value truth status.