RFD-0022 — CQRS projections + DRedc IVM behind a ProjectionMaintainer trait
Question
The kernel computes derived facts from asserted facts via stratified Datalog rules + meta-property calculus. When asserted facts change, what algorithm maintains the derived facts incrementally, and how is that algorithm bound to the system so it can evolve?
Decision
CQRS architecture. Reads and writes are separated. Writes append to the event log (RFD-0021). Reads go through projections — materialized views maintained incrementally by an Incremental View Maintenance (IVM) algorithm.
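As a rough illustration of the read/write split, the sketch below appends assertions to a log and lets a projection catch up by offset. `Event`, `EventLog`, and `Projection` are hypothetical names for this example, not the kernel's actual types, and the projection only mirrors asserted facts; it does not run any rules.

```rust
use std::collections::BTreeSet;

// Hypothetical fact shape for illustration: (predicate, subject, object).
type Fact = (&'static str, u32, u32);

#[derive(Clone, Debug, PartialEq)]
enum Event {
    Asserted(Fact),
    Retracted(Fact),
}

/// Write side: an append-only log. Nothing is updated in place.
#[derive(Default)]
struct EventLog {
    events: Vec<Event>,
}

impl EventLog {
    fn append(&mut self, e: Event) {
        self.events.push(e);
    }
}

/// Read side: a materialized view that remembers how far into the log it has read.
#[derive(Default)]
struct Projection {
    applied: usize,
    facts: BTreeSet<Fact>,
}

impl Projection {
    /// Apply only the events appended since the last catch-up.
    fn catch_up(&mut self, log: &EventLog) {
        for e in &log.events[self.applied..] {
            match e {
                Event::Asserted(f) => {
                    self.facts.insert(*f);
                }
                Event::Retracted(f) => {
                    self.facts.remove(f);
                }
            }
        }
        self.applied = log.events.len();
    }
}
```

In this toy version a write never touches the view; the view is stale until `catch_up` runs, which is where an incremental maintainer would do its work.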
DRedc IVM is the current algorithm. DRedc (Delete-Rederive with counting) handles the maintenance for stratified Datalog rules. It supersedes the simpler DRed by tracking derivation multiplicities. For symmetric / transitive clique predicates, where delete-rederive degrades, compile-time detection of the pathology enables a fallback path that avoids the pathological case.
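The role of multiplicities can be seen in a much smaller setting than DRedc itself. The sketch below is plain derivation counting for a single non-recursive rule, not DRedc: recursive strata still need the delete and rederive phases, which are not shown. The rule, the `CountingView` type, and its methods are assumptions made for the example.

```rust
use std::collections::{HashMap, HashSet};

type Edge = (u32, u32);

/// Toy view for the non-recursive rule
///   two_hop(x, z) :- edge(x, y), edge(y, z).
/// Each derived fact carries the number of distinct derivations supporting it.
#[derive(Default)]
struct CountingView {
    edges: HashSet<Edge>,
    counts: HashMap<Edge, usize>,
}

impl CountingView {
    /// Heads of every derivation that uses `e` at least once.
    /// Assumes `e` is currently in `self.edges`.
    fn derivations_using(&self, e: Edge) -> Vec<Edge> {
        let (a, b) = e;
        let mut heads = Vec::new();
        for &(y, z) in &self.edges {
            // `e` as the first body atom: edge(a, b), edge(b, z).
            if y == b {
                heads.push((a, z));
            }
        }
        for &(x, y) in &self.edges {
            // `e` as the second body atom only; the case where both atoms
            // are `e` was already counted by the loop above.
            if y == a && (x, y) != e {
                heads.push((x, b));
            }
        }
        heads
    }

    fn insert(&mut self, e: Edge) {
        if !self.edges.insert(e) {
            return; // already asserted
        }
        for head in self.derivations_using(e) {
            *self.counts.entry(head).or_insert(0) += 1;
        }
    }

    fn delete(&mut self, e: Edge) {
        if !self.edges.contains(&e) {
            return;
        }
        for head in self.derivations_using(e) {
            let remaining = match self.counts.get_mut(&head) {
                Some(n) => {
                    *n -= 1;
                    *n
                }
                None => continue,
            };
            // The derived fact disappears only when its last derivation goes.
            if remaining == 0 {
                self.counts.remove(&head);
            }
        }
        self.edges.remove(&e);
    }

    fn holds(&self, fact: Edge) -> bool {
        self.counts.contains_key(&fact)
    }
}
```

With edges 1→2→4 and 1→3→4, `two_hop(1, 4)` has two derivations, so deleting `edge(1, 2)` only lowers its count and nothing has to be rederived.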
Algorithm seam: ProjectionMaintainer trait. The IVM algorithm is bound behind a trait. The current implementation is DRedc; future implementations (FBF-2019 for the clique-pathology case, DBSP for streaming workloads) plug in at the same seam without touching projection consumers.
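One possible shape for the seam is sketched below. The method names, the `Delta` type, and the toy rule are assumptions for illustration; the actual trait signature is not given in this RFD. `RecomputeMaintainer` is a deliberately naive stand-in that re-saturates from scratch, not DRedc.

```rust
use std::collections::BTreeSet;

// Hypothetical fact shape: (predicate, subject, object).
type Fact = (&'static str, u32, u32);

/// A change set: facts added and facts removed.
#[derive(Debug, Default, PartialEq)]
struct Delta {
    added: BTreeSet<Fact>,
    removed: BTreeSet<Fact>,
}

/// Assumed trait shape: consumers see only this interface.
trait ProjectionMaintainer {
    /// Apply a change to the asserted facts and return the change to the materialization.
    fn apply(&mut self, asserted: &Delta) -> Delta;
    /// Current materialized facts.
    fn derived(&self) -> &BTreeSet<Fact>;
}

/// Naive stand-in: recompute everything on each change, then diff.
#[derive(Default)]
struct RecomputeMaintainer {
    asserted: BTreeSet<Fact>,
    derived: BTreeSet<Fact>,
}

impl RecomputeMaintainer {
    /// Toy rule: p(y, x) :- p(x, y). The result includes the asserted facts.
    fn saturate(asserted: &BTreeSet<Fact>) -> BTreeSet<Fact> {
        asserted
            .iter()
            .flat_map(|&(p, x, y)| [(p, x, y), (p, y, x)])
            .collect()
    }
}

impl ProjectionMaintainer for RecomputeMaintainer {
    fn apply(&mut self, change: &Delta) -> Delta {
        for f in &change.removed {
            self.asserted.remove(f);
        }
        for f in &change.added {
            self.asserted.insert(*f);
        }
        let next = Self::saturate(&self.asserted);
        let out = Delta {
            added: next.difference(&self.derived).copied().collect(),
            removed: self.derived.difference(&next).copied().collect(),
        };
        self.derived = next;
        out
    }

    fn derived(&self) -> &BTreeSet<Fact> {
        &self.derived
    }
}

/// A consumer written against the trait, not against any one algorithm.
fn is_linked(m: &dyn ProjectionMaintainer, x: u32, y: u32) -> bool {
    m.derived().contains(&("link", x, y))
}
```

Because `is_linked` takes `&dyn ProjectionMaintainer`, replacing `RecomputeMaintainer` with another implementation would leave it unchanged, which is the property the seam is meant to provide.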
PosBool(M) provenance is a value field, not the IVM weight semiring. The provenance encoding (RFD-0007) lives on each derived fact as a JSONB column. DBSP-style weight semirings (when DBSP becomes the maintainer) operate on Z-set arithmetic separately; the two semantics don’t interfere.
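To make the separation concrete, the sketch below keeps a PosBool-style expression as ordinary per-row data and a Z-set weight as separate signed arithmetic. The DNF encoding, the `DerivedRow` and `ZSet` types, and the integer fact ids are assumptions for the example; the actual encoding is defined in RFD-0007 and is not reproduced here.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// One conjunction of source-fact ids.
type Monomial = BTreeSet<u32>;

/// Positive Boolean expression in DNF: a set of monomials, kept minimal by absorption.
#[derive(Clone, Debug, Default, PartialEq)]
struct PosBool(BTreeSet<Monomial>);

impl PosBool {
    fn var(id: u32) -> Self {
        PosBool(BTreeSet::from([BTreeSet::from([id])]))
    }

    /// Absorption (a OR (a AND b) = a): drop monomials that strictly contain another one.
    fn minimize(ms: BTreeSet<Monomial>) -> Self {
        let keep = ms
            .iter()
            .filter(|m| !ms.iter().any(|o| o != *m && o.is_subset(*m)))
            .cloned()
            .collect();
        PosBool(keep)
    }

    fn or(&self, other: &Self) -> Self {
        Self::minimize(self.0.union(&other.0).cloned().collect())
    }

    fn and(&self, other: &Self) -> Self {
        let mut ms = BTreeSet::new();
        for a in &self.0 {
            for b in &other.0 {
                ms.insert(a.union(b).copied().collect::<Monomial>());
            }
        }
        Self::minimize(ms)
    }
}

/// Stored row: provenance is a value on the fact (a JSONB column in the real schema).
struct DerivedRow {
    fact: u32,
    provenance: PosBool,
}

/// Maintainer-side arithmetic: fact id -> signed weight. It never reads provenance.
#[derive(Default)]
struct ZSet(BTreeMap<u32, i64>);

impl ZSet {
    fn add(&mut self, fact: u32, w: i64) {
        let total = {
            let e = self.0.entry(fact).or_insert(0);
            *e += w;
            *e
        };
        if total == 0 {
            self.0.remove(&fact); // zero-weight entries are dropped
        }
    }
}
```

The two structures share only the fact id, so either encoding could change without the other noticing.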
v0.3 migration criteria for DRedc → DBSP are codified separately. Trigger conditions: clique-predicate workload pressure exceeds threshold; streaming derivation patterns become common; compile-time DRedc-pathology detection flags too many programs.
Rationale
CQRS is the only honest model. The kernel’s derived facts are a function of the asserted facts and the rules. Maintaining them in-place would have meant every assertion ran the full saturation pipeline. CQRS lets the saturation cost amortize across reads.
DRedc over DRed. Delete-Rederive is the standard incremental algorithm but degrades on symmetric / transitive clique predicates (every fact’s removal triggers re-derivation of every other fact in the clique). DRedc’s counting addresses this; FBF-2019 handles the residual cases. Compile-time pathology detection lets the system pick the right algorithm per program.
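A purely syntactic version of the compile-time check might look like the sketch below: it flags a predicate that has both a symmetry rule and a transitivity rule, then picks an algorithm. The rule representation, the function names, and the `Algorithm` enum are hypothetical; the real detector may use a different criterion.

```rust
use std::collections::BTreeSet;

/// Atom over variables only, e.g. same(x, y).
#[derive(Clone, Debug, PartialEq)]
struct Atom {
    pred: &'static str,
    args: Vec<&'static str>,
}

#[derive(Clone, Debug)]
struct Rule {
    head: Atom,
    body: Vec<Atom>,
}

fn atom(pred: &'static str, args: &[&'static str]) -> Atom {
    Atom { pred, args: args.to_vec() }
}

/// Head and all body atoms are binary and share one predicate.
fn is_binary_self_recursive(r: &Rule, body_len: usize) -> bool {
    r.body.len() == body_len
        && r.head.args.len() == 2
        && r.body.iter().all(|a| a.pred == r.head.pred && a.args.len() == 2)
}

/// Matches p(y, x) :- p(x, y).
fn is_symmetry(r: &Rule) -> bool {
    is_binary_self_recursive(r, 1)
        && r.head.args[0] != r.head.args[1]
        && r.head.args[0] == r.body[0].args[1]
        && r.head.args[1] == r.body[0].args[0]
}

/// Matches p(x, z) :- p(x, y), p(y, z), with the body atoms in either order.
fn is_transitivity(r: &Rule) -> bool {
    if !is_binary_self_recursive(r, 2) {
        return false;
    }
    let (x, z) = (r.head.args[0], r.head.args[1]);
    [(0usize, 1usize), (1, 0)].iter().any(|&(i, j)| {
        let (first, second) = (&r.body[i], &r.body[j]);
        let y = first.args[1];
        first.args[0] == x && second.args[0] == y && second.args[1] == z && y != x && y != z
    })
}

/// Predicates that are both symmetric and transitive, i.e. clique-forming.
fn clique_predicates(rules: &[Rule]) -> BTreeSet<&'static str> {
    let sym: BTreeSet<_> = rules.iter().filter(|r| is_symmetry(r)).map(|r| r.head.pred).collect();
    let trans: BTreeSet<_> =
        rules.iter().filter(|r| is_transitivity(r)).map(|r| r.head.pred).collect();
    sym.intersection(&trans).copied().collect()
}

#[derive(Debug, PartialEq)]
enum Algorithm {
    DRedc,
    Fbf,
}

/// Per-program choice: fall back when any clique predicate is present.
fn choose_maintainer(rules: &[Rule]) -> Algorithm {
    if clique_predicates(rules).is_empty() {
        Algorithm::DRedc
    } else {
        Algorithm::Fbf
    }
}
```

A program with only a transitive `path` predicate would stay on DRedc under this check, while one that also declares a symmetric and transitive `same` predicate would take the fallback.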
Trait seam future-proofs the algorithm choice. IVM is an active research area. The trait boundary means switching to DBSP, Differential Dataflow, or some future algorithm doesn’t require touching every consumer of derived facts.
PosBool as value field. Provenance is per-fact data, not part of the maintenance arithmetic. Mixing them would have constrained provenance encoding to whatever shape the IVM algorithm’s weight semiring exposes; keeping them separate preserves the freedom to change either independently.
TED (LogicBlox trace-edit-distance) ruled out. Considered as an alternative; the trace-management overhead doesn’t pay for itself at the workload sizes we anticipate.
Consequences
- ProjectionMaintainer trait at crates/nous/src/reasoning/... (current implementation: DRedc).
- Compile-time detection of symmetric + transitive clique predicates triggers FBF-2019 fallback.
- Path-closure aggregates and multi-standpoint bridge rules use demand-driven caching projections (scope-setting; not yet fully landed).
- DBSP feasibility tracked in v0.3 conditional roadmap; not committed.
- Provenance remains a JSONB column orthogonal to the maintainer’s internal data structures.