# Architecture
One process, zero serialization, all modules sharing memory. ArcFlow is a modular monolith designed like a system-on-chip: just as Apple's M1 puts CPU, GPU, and RAM on one die with unified memory instead of bolting them together over buses, ArcFlow puts graph storage, vector search, graph algorithms, and GPU dispatch in a single address space, all sharing one GraphStore. No network hops. No message queues.
This architecture eliminates entire categories of infrastructure: no external cache service (the graph is already in-process), no separate vector database (vector indexes live alongside graph data), no workflow engine for orchestration (procedures run in the same runtime). The result is fewer moving parts, lower latency, and a single binary to deploy.
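A minimal sketch of what "zero serialization, shared memory" means in practice: modules are plain Rust code borrowing one in-memory store, so data crosses module boundaries as typed references rather than encoded bytes. All type and function names here are illustrative, not the real ArcFlow API.

```rust
use std::collections::HashMap;

/// Minimal stand-in for a shared in-process graph store.
struct GraphStore {
    adjacency: HashMap<u64, Vec<u64>>,
    labels: HashMap<u64, String>,
}

/// "Storage module": returns a borrowed view — no copy, no encoding.
fn neighbors(store: &GraphStore, node: u64) -> &[u64] {
    store.adjacency.get(&node).map(Vec::as_slice).unwrap_or(&[])
}

/// "Algorithm module": consumes the storage module's output directly,
/// in the same address space, with no wire protocol in between.
fn neighbor_labels<'a>(store: &'a GraphStore, node: u64) -> Vec<&'a str> {
    neighbors(store, node)
        .iter()
        .filter_map(|id| store.labels.get(id).map(String::as_str))
        .collect()
}
```

In a service-oriented stack, the same hand-off would require serializing the neighbor list, sending it over a socket, and deserializing it on the other side.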
## Design Principles
- Deterministic: Same query + same state = same results. Always.
- Local-first: Single binary, full authority, no network dependency.
- Agent-native: Structured output, typed errors, machine-readable contracts.
- Rust-native: Zero-cost abstractions, memory safety, single binary.
- Zero serialization: Modules communicate through shared Rust types, not wire protocols.
- Evidence-first: Every fact carries the ArcFlow Evidence Model — observation class, confidence score, provenance chain. Trust is a query dimension, not an afterthought.
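To make the evidence-first principle concrete, here is a hedged sketch of a fact carrying the three dimensions named above (observation class, confidence score, provenance chain), with trust usable as a query filter. The type and field names are invented for illustration and are not the real ArcFlow Evidence Model types.

```rust
/// Illustrative observation classes (not ArcFlow's actual taxonomy).
#[derive(Debug, Clone, PartialEq)]
enum ObservationClass {
    Observed, // directly measured
    Derived,  // computed from other facts
    Asserted, // stated by a source without measurement
}

#[derive(Debug, Clone)]
struct Evidence {
    class: ObservationClass,
    confidence: f64,         // in [0.0, 1.0]
    provenance: Vec<String>, // chain of sources, oldest first
}

#[derive(Debug, Clone)]
struct Fact {
    subject: String,
    predicate: String,
    object: String,
    evidence: Evidence,
}

/// Trust as a query dimension: keep only facts above a confidence floor.
fn at_least(facts: &[Fact], min_confidence: f64) -> Vec<&Fact> {
    facts
        .iter()
        .filter(|f| f.evidence.confidence >= min_confidence)
        .collect()
}
```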
## Three-Plane Architecture
| Plane | Authority | What lives here |
|---|---|---|
| Authored Workspace | Source of truth for intent | Schemas, queries, facts in git |
| Canonical Engine | Engine-managed durability | WAL, checkpoint, manifest |
| Derived Projection | Non-authoritative | Exports, caches, compatibility files |
Rules: Workspace → Engine (explicit load). Engine → Projection (explicit export). Projections never feed back as authority.
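The plane rules above are small enough to state as a type. This sketch (names are illustrative) encodes the two permitted, forward-only data flows; everything else, including any projection feeding back as authority, is rejected.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
enum Plane {
    AuthoredWorkspace,
    CanonicalEngine,
    DerivedProjection,
}

/// Returns true only for the two explicit, forward-only flows:
/// Workspace → Engine (load) and Engine → Projection (export).
fn flow_allowed(from: Plane, to: Plane) -> bool {
    matches!(
        (from, to),
        (Plane::AuthoredWorkspace, Plane::CanonicalEngine)       // explicit load
            | (Plane::CanonicalEngine, Plane::DerivedProjection) // explicit export
    )
}
```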
## Module architecture
Core (bottom of stack):
| Layer | Responsibility |
|---|---|
| Core types | Node, relationship, property primitives; confidence and evidence types |
| Graph engine | Graph store, property index, adjacency structures, incremental computation, standing queries, window operators, live algorithms |
| Storage | Journaled storage, WAL, snapshot/restore |
Query and incremental (middle):
| Layer | Responsibility |
|---|---|
| Query IR | Compiled query representation — the target for the query compiler |
| Query compiler | WorldCypher (ISO GQL) parser, query planning, incrementalization |
| Runtime | Execution engine, ArcFlow Adaptive Dispatch, GPU kernels |
Public API (top of stack):
| Surface | Responsibility |
|---|---|
| Rust SDK | Published as arcflow on crates.io |
| CLI | REPL, TCP/HTTP/PostgreSQL servers, self-update, structured output — user-facing binary: arcflow |
| FFI | C ABI for Python, TypeScript, and C++ bindings |
| MCP | Model Context Protocol server (stdio JSON-RPC) |
| WASM | Browser and edge runtime |
Dependencies flow inward: the transport and CLI layers depend on core, never the reverse.
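Collapsed to plain functions, the inward dependency rule looks like this: each outer layer calls the layer beneath it, and the core never calls upward. Layer names and functions are illustrative stand-ins, not the real crate boundaries.

```rust
/// Core layer: primitive graph operations, unaware of anything above it.
fn core_degree(adjacency: &[Vec<usize>], node: usize) -> usize {
    adjacency.get(node).map(|n| n.len()).unwrap_or(0)
}

/// Runtime layer: depends on core only.
fn runtime_max_degree(adjacency: &[Vec<usize>]) -> usize {
    (0..adjacency.len())
        .map(|n| core_degree(adjacency, n))
        .max()
        .unwrap_or(0)
}

/// API surface (CLI/SDK): depends on runtime and core — never the reverse.
fn api_report(adjacency: &[Vec<usize>]) -> String {
    format!("max_degree={}", runtime_max_degree(adjacency))
}
```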
## Why This Matters
A typical knowledge-graph stack requires 4-6 services: a graph database, a vector store, an analytics engine, a job runner, a cache, and a message bus. Each introduces serialization overhead, operational complexity, and failure modes. ArcFlow collapses this to one process and one binary — graph storage, incremental computation, vector search, and PostgreSQL wire protocol compatibility all in the same unified address space.
For AI workloads, in-process execution means a GraphRAG query can traverse the graph, run vector similarity, execute PageRank, and score confidence — all without a single network call or data format conversion. Measured end-to-end on a MacBook Air M4 (10-core, 24GB), this architecture delivers 154M PageRank nodes/sec and 25K vector queries/sec on CPU alone — on a fanless laptop.
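The shape of such an in-process pipeline can be sketched in a few lines: one function traverses the graph, scores vector similarity, and weights by evidence confidence, with no network call or format conversion between steps. This is a simplified one-hop illustration with invented names, not the real ArcFlow query path.

```rust
use std::collections::HashMap;

struct Node {
    embedding: Vec<f32>,
    confidence: f32, // evidence confidence in [0, 1]
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
}

/// One-hop GraphRAG step: rank `start`'s neighbors by similarity to the
/// query embedding, scaled by each node's confidence. Highest first.
fn retrieve(
    adjacency: &HashMap<u64, Vec<u64>>,
    nodes: &HashMap<u64, Node>,
    start: u64,
    query: &[f32],
) -> Vec<(u64, f32)> {
    let mut scored: Vec<(u64, f32)> = adjacency
        .get(&start)
        .into_iter()
        .flatten()
        .map(|&id| {
            let n = &nodes[&id];
            (id, cosine(&n.embedding, query) * n.confidence)
        })
        .collect();
    scored.sort_by(|a, b| b.1.partial_cmp(&a.1).unwrap());
    scored
}
```

Every step reads and writes the same in-memory structures; in a multi-service stack each arrow in this pipeline would be a serialize-send-deserialize round trip.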
Three execution innovations sit at the core of this performance:
- ArcFlow Graph Kernel — processes graph algorithms as a single parallel pass across all nodes, not sequential traversal
- ArcFlow Adaptive Dispatch — routes each operation to the fastest available hardware (CPU, Metal, CUDA) via a live cost model at runtime
- ArcFlow GPU Index — a pointer-free spatial index that transfers directly to GPU memory, enabling high-density spatial queries at GPU speed
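To illustrate the "single parallel pass" idea behind the Graph Kernel: each PageRank iteration visits every node once, scattering rank along out-edges, rather than traversing from a start node. This sketch runs sequentially; the per-node loop is the part a kernel would parallelize (reads of `rank` are independent, while a parallel version would need atomic adds or a gather formulation for the writes). It is a textbook PageRank, not ArcFlow's implementation.

```rust
/// Power-iteration PageRank over an adjacency list of out-edges.
fn pagerank(out_edges: &[Vec<usize>], damping: f64, iters: usize) -> Vec<f64> {
    let n = out_edges.len();
    let mut rank = vec![1.0 / n as f64; n];
    for _ in 0..iters {
        let mut next = vec![(1.0 - damping) / n as f64; n];
        // One pass over all nodes per iteration — no traversal order.
        for (node, targets) in out_edges.iter().enumerate() {
            if targets.is_empty() {
                // Dangling node: spread its rank uniformly.
                for r in next.iter_mut() {
                    *r += damping * rank[node] / n as f64;
                }
            } else {
                let share = damping * rank[node] / targets.len() as f64;
                for &t in targets {
                    next[t] += share;
                }
            }
        }
        rank = next;
    }
    rank
}
```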
Forward vision: The unified address space is the foundation for in-process AI inference, real-time sensor fusion, and perception pipelines where latency budgets are measured in microseconds, not milliseconds. Same architecture, same query language, expanded compute fabric across CPU, CUDA, and Metal.
## See Also
- GPU Acceleration — unified compute across CPU, CUDA, and Metal
- Language Bindings — same architecture, every language