© 2026 OZ. All rights reserved.


Trusted RAG

Standard RAG retrieves flat chunks. The LLM gets text — no confidence level, no provenance trail, no structure showing how facts connect.

Trusted RAG runs on a world model. Every document and fact is a graph node with a confidence score (_confidence), an observation class (_observation_class: 'observed' | 'inferred' | 'predicted'), and a provenance edge back to its source. Retrieval follows relationship paths, not just vector proximity. The LLM receives not just what was found but how confident the system is in it — and why.

ArcFlow handles the retrieval side: vector search, graph traversal, and confidence filtering compose in one engine, one query, no joins across services.

User query
    │
    ▼
[1] Embed query ────────────────────────► Embedding model (external)
    │
    ▼
[2] Vector search ──────────────────────► ArcFlow vector index
    │
    ▼
[3] Graph context + confidence filter ──► Traverse relationships, filter by _confidence
    │
    ▼
[4] Assemble prompt ────────────────────► Context with confidence scores + citations
    │
    ▼
[5] LLM generation ─────────────────────► Response grounded in trusted facts

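ArcFlow owns stages 2 and 3 of this pipeline. Stages 1, 4, and 5 are external. The walkthrough below uses its own step numbering, which does not map one-to-one onto these stages.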
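The division of labor in the diagram can be sketched as a small orchestration function. This is a minimal sketch, not ArcFlow's API: `PipelineDeps`, `RetrievedFact`, and `answerQuestion` are hypothetical names, and the external stages (embedding, prompt assembly, generation) are injected so that only `retrieve` would talk to ArcFlow.

```typescript
// Hypothetical shapes for illustration; none of these names come from ArcFlow.
interface RetrievedFact {
  text: string;
  score: number;      // vector similarity
  confidence: number; // value of _confidence
  source: string;     // citation reached through the provenance edge
}

interface PipelineDeps {
  embed(query: string): Promise<number[]>;                        // stage 1, external
  retrieve(queryVector: number[]): Promise<RetrievedFact[]>;      // stages 2 and 3, ArcFlow
  assemblePrompt(question: string, facts: RetrievedFact[]): string; // stage 4, external
  generate(prompt: string): Promise<string>;                      // stage 5, external
}

// Runs the five stages in order and returns the model's answer.
async function answerQuestion(question: string, deps: PipelineDeps): Promise<string> {
  const queryVector = await deps.embed(question);
  const facts = await deps.retrieve(queryVector);
  const prompt = deps.assemblePrompt(question, facts);
  return deps.generate(prompt);
}
```

Keeping the stages behind an interface makes it straightforward to swap the embedding model or LLM without touching the retrieval query.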


Step 1: Store documents as confidence-scored nodes

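Each document (or chunk) is a graph node with an embedding, a confidence score, and an observation class. The embedding literals in this guide are truncated for readability; real vectors should have as many components as the vector index declares (1536 in Step 2):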

CREATE (d:Document {
  title: 'Rust Ownership Model',
  content: 'Rust uses ownership rules to manage memory without garbage collection...',
  source: 'rust-book-ch4',
  _confidence: 0.98,
  _observation_class: 'observed',
  embedding: [0.12, -0.34, 0.56, 0.78, -0.91]
})

For chunked documents, create a parent-child structure:

-- Parent document
CREATE (doc:Document {
  title: 'Rust Book Chapter 4',
  source: 'rust-book',
  _confidence: 0.99,
  _observation_class: 'observed'
})
 
-- Chunks with embeddings
CREATE (c1:Chunk {
  text: 'Each value in Rust has a variable that is its owner...',
  position: 0,
  _confidence: 0.98,
  _observation_class: 'observed',
  embedding: [0.12, -0.34, 0.56]
})
 
CREATE (c2:Chunk {
  text: 'When the owner goes out of scope, the value is dropped...',
  position: 1,
  _confidence: 0.97,
  _observation_class: 'observed',
  embedding: [0.15, -0.28, 0.61]
})
 
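-- Link chunks to document with provenance
-- (matching on position alone assumes a single document; with several documents, add a document key to each chunk)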
MATCH (doc:Document {title: 'Rust Book Chapter 4'}), (c1:Chunk {position: 0}), (c2:Chunk {position: 1})
CREATE (doc)-[:HAS_CHUNK]->(c1)
CREATE (doc)-[:HAS_CHUNK]->(c2)
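Before chunks can be stored, the source text has to be split and numbered. The following is a minimal sketch of one way to do that, assuming paragraph-based splitting; `chunkDocument` and `ChunkRecord` are hypothetical helpers, not part of ArcFlow, and the `position` field mirrors the property used in the CREATE statements above.

```typescript
interface ChunkRecord {
  text: string;
  position: number; // zero-based order within the parent document
}

// Splits on blank lines, then packs whole paragraphs into chunks of at most
// maxChars characters. A single paragraph longer than maxChars is kept whole.
function chunkDocument(content: string, maxChars = 500): ChunkRecord[] {
  const paragraphs = content
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: ChunkRecord[] = [];
  let current = "";
  for (const paragraph of paragraphs) {
    // Close the current chunk if adding this paragraph would exceed the limit.
    if (current.length > 0 && current.length + 1 + paragraph.length > maxChars) {
      chunks.push({ text: current, position: chunks.length });
      current = "";
    }
    current = current.length > 0 ? `${current}\n${paragraph}` : paragraph;
  }
  if (current.length > 0) {
    chunks.push({ text: current, position: chunks.length });
  }
  return chunks;
}
```

Each returned record would then get its own embedding and be written as a Chunk node linked to the parent through HAS_CHUNK.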

Link related concepts across chunks — this is what makes graph retrieval different from flat retrieval:

MATCH (c1:Chunk {position: 0}), (c2:Chunk {position: 1})
CREATE (c1)-[:RELATED_TO {weight: 0.85}]->(c2)

Step 2: Create a vector index

CREATE VECTOR INDEX chunk_embeddings
  ON :Chunk(embedding)
  DIMENSIONS 1536
  SIMILARITY cosine

The index is built in-memory, persisted via WAL, and supports cosine similarity, euclidean distance, and dot product. Same engine as the graph — no separate service.
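For reference, the three similarity options have standard definitions. The sketch below spells them out in plain TypeScript; it illustrates the math only and says nothing about how ArcFlow computes them internally. The function names are illustrative.

```typescript
// Sum of pairwise products; higher means more aligned (and larger magnitude).
function dotProduct(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += a[i] * b[i];
  return sum;
}

// Dot product of the two vectors after scaling each to unit length; range [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  const norm = Math.sqrt(dotProduct(a, a)) * Math.sqrt(dotProduct(b, b));
  if (norm === 0) throw new Error("cosine similarity is undefined for a zero vector");
  return dotProduct(a, b) / norm;
}

// Straight-line distance; lower means closer.
function euclideanDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let sum = 0;
  for (let i = 0; i < a.length; i++) {
    const d = a[i] - b[i];
    sum += d * d;
  }
  return Math.sqrt(sum);
}
```

Applied to the two truncated chunk vectors from Step 1, `cosineSimilarity([0.12, -0.34, 0.56], [0.15, -0.28, 0.61])` comes out close to 1, which matches the intuition that adjacent chunks of the same chapter are semantically near each other.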

Step 3: Retrieve by vector similarity

-- Find the 10 chunks most similar to a query vector
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 10)
  YIELD node AS chunk, score
WHERE chunk._confidence > 0.7
  AND chunk._observation_class IN ['observed', 'inferred']
RETURN chunk.text, score, chunk._confidence, chunk._observation_class
ORDER BY score DESC

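The confidence filter composes directly with vector search in a single query, with no post-processing join. Because the filter runs on the 10 nearest neighbors, the query can return fewer than 10 rows; request more neighbors than you need when the filter is strict.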

Step 4: Enrich with graph context

After finding similar chunks, traverse the graph to collect structured context:

-- For each matched chunk, collect neighbors and their confidence
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 5)
  YIELD node AS chunk, score
 
OPTIONAL MATCH (chunk)-[:RELATED_TO]->(related:Chunk)
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
 
RETURN
  chunk.text AS match,
  score AS similarity,
  chunk._confidence AS chunk_confidence,
  doc.title AS source,
  doc.source AS source_id,
  collect(related.text) AS related_context
ORDER BY score DESC

The LLM receives the matched chunk, its parent document (provenance), and related chunks (context). Not just text — structure.
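Turning those rows into a prompt is the external "assemble prompt" stage. The sketch below shows one possible layout, assuming each result row arrives as an object keyed by the RETURN aliases above; the actual row shape returned by a given binding may differ, and `ContextRow` and `assemblePrompt` are hypothetical names.

```typescript
// Assumed row shape, mirroring the RETURN aliases of the enrichment query.
interface ContextRow {
  match: string;
  similarity: number;
  chunk_confidence: number;
  source: string | null;    // null when OPTIONAL MATCH found no parent document
  source_id: string | null;
  related_context: string[];
}

// Formats each row as a numbered fact with its citation, confidence, and neighbors.
function assemblePrompt(question: string, rows: ContextRow[]): string {
  const facts = rows.map((row, i) => {
    const citation = row.source
      ? `${row.source} (${row.source_id ?? "unknown id"})`
      : "unknown source";
    const lines = [
      `[${i + 1}] ${row.match}`,
      `    source: ${citation}`,
      `    confidence: ${row.chunk_confidence.toFixed(2)}, similarity: ${row.similarity.toFixed(2)}`,
    ];
    for (const related of row.related_context) {
      lines.push(`    related: ${related}`);
    }
    return lines.join("\n");
  });
  return [
    "Answer using only the numbered facts and cite them by number.",
    ...facts,
    `Question: ${question}`,
  ].join("\n\n");
}
```

Numbering the facts gives the model a stable handle for citations, and printing the confidence next to each fact lets it hedge or decline when support is weak.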

Step 5: Confidence-filtered retrieval

For safety-critical or high-stakes retrieval, filter to only observed facts above a confidence threshold:

-- Only retrieve high-confidence observed facts
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 20)
  YIELD node AS chunk, score
 
WHERE chunk._observation_class = 'observed'
  AND chunk._confidence > 0.85
 
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
 
RETURN
  chunk.text,
  score,
  chunk._confidence,
  doc.title AS source,
  doc.source AS citation
ORDER BY (score * chunk._confidence) DESC
LIMIT 5

Ranking by score * _confidence surfaces results that are both semantically similar and epistemically trustworthy.
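To see how the product changes the ordering, here is a small in-memory sketch of the same filter-and-rank logic. It mirrors the query above for illustration only; `Candidate` and `rankTrusted` are hypothetical names, and the sample values are made up.

```typescript
type ObservationClass = "observed" | "inferred" | "predicted";

interface Candidate {
  text: string;
  score: number;      // vector similarity
  confidence: number; // value of _confidence
  observationClass: ObservationClass;
}

// Keeps observed facts above the threshold, then orders by score * confidence.
function rankTrusted(candidates: Candidate[], minConfidence = 0.85, limit = 5): Candidate[] {
  return candidates
    .filter((c) => c.observationClass === "observed" && c.confidence > minConfidence)
    .sort((a, b) => b.score * b.confidence - a.score * a.confidence)
    .slice(0, limit);
}

const ranked = rankTrusted([
  { text: "A", score: 0.95, confidence: 0.86, observationClass: "observed" }, // product 0.817
  { text: "B", score: 0.90, confidence: 0.99, observationClass: "observed" }, // product 0.891
  { text: "C", score: 0.99, confidence: 0.99, observationClass: "inferred" }, // dropped: not observed
  { text: "D", score: 0.99, confidence: 0.50, observationClass: "observed" }, // dropped: low confidence
]);
console.log(ranked.map((c) => c.text)); // [ 'B', 'A' ]
```

Chunk A is the closest vector match among the admissible candidates, yet B ranks first because its higher confidence outweighs the small similarity gap.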

Step 6: Full-text + vector + graph in one query

ArcFlow's three search modalities compose in a single query. The example assumes a full-text index named chunk_ft already exists on the chunk text:

-- Full-text match for exact term, vector for semantic similarity, graph for context
CALL db.index.fulltext.queryNodes('chunk_ft', $term)
  YIELD node AS ft_result, score AS ft_score
 
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 10)
  YIELD node AS vec_result, score AS vec_score
 
WITH ft_result, ft_score, vec_result, vec_score
WHERE ft_result = vec_result  -- same node matched both ways (the chunks in this guide have no id property)
 
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(ft_result)
 
RETURN
  ft_result.text,
  ft_score,
  vec_score,
  ft_result._confidence,
  doc.title AS source
ORDER BY (ft_score + vec_score) DESC
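Summing the two scores directly works when they share a scale. If they do not, one common option is to normalize each result list before adding. The sketch below shows min-max normalization followed by the same intersect-and-sum step, done client-side; this is a swapped-in technique for illustration, not something the query above performs, and `Hit`, `minMaxNormalize`, and `fuseHits` are hypothetical names.

```typescript
interface Hit {
  id: string;   // any stable key for the chunk
  score: number;
}

// Rescales scores to [0, 1] within one result list.
function minMaxNormalize(hits: Hit[]): Map<string, number> {
  const scores = hits.map((h) => h.score);
  const min = Math.min(...scores);
  const range = Math.max(...scores) - min;
  return new Map<string, number>(
    hits.map((h): [string, number] => [h.id, range === 0 ? 1 : (h.score - min) / range])
  );
}

// Keeps only chunks present in both lists and ranks them by the summed normalized score.
function fuseHits(fullText: Hit[], vector: Hit[]): Hit[] {
  const ft = minMaxNormalize(fullText);
  const vec = minMaxNormalize(vector);
  const fused: Hit[] = [];
  ft.forEach((ftScore, id) => {
    const vecScore = vec.get(id);
    if (vecScore !== undefined) fused.push({ id, score: ftScore + vecScore });
  });
  return fused.sort((a, b) => b.score - a.score);
}
```

Requiring a hit in both lists favors precision. Taking the union instead, with a score of zero for the missing modality, would favor recall.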

Step 7: Connect to agents

napi-rs (Node.js, fastest path)

import { open } from '@ozinc/arcflow'
 
const db = open('./data/knowledge-graph')
 
async function trustedRetrieval(queryVector: number[], minConfidence = 0.8) {
  return db.query(
    `CALL algo.vectorSearch('chunk_embeddings', $vec, 10)
       YIELD node AS chunk, score
     WHERE chunk._observation_class = 'observed'
       AND chunk._confidence > $minConf
     OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
     RETURN chunk.text, score, chunk._confidence, doc.title AS source
     ORDER BY (score * chunk._confidence) DESC
     LIMIT 5`,
    { vec: JSON.stringify(queryVector), minConf: minConfidence }
  )
}

CLI (coding agents: Claude Code, Codex, Gemini CLI)

# \$vec is escaped so the shell passes it through to ArcFlow instead of expanding it
arcflow query "
  CALL algo.vectorSearch('chunk_embeddings', \$vec, 10)
  YIELD node AS chunk, score
  WHERE chunk._confidence > 0.8
  RETURN chunk.text, score, chunk._confidence
  ORDER BY score DESC
" --param vec='[0.12, -0.34, 0.56]'

MCP (cloud chat UIs: ChatGPT, Claude.ai)

npx arcflow-mcp --data-dir ./data/knowledge-graph

The MCP server's graph_rag tool runs the full retrieval pipeline. The LLM queries it directly without generating code.


Tips

Confidence thresholds: Start at 0.7 for general retrieval. Raise to 0.85+ when precision matters more than recall. For safety-critical domains, only retrieve observed facts.

Observation class filtering: observed = direct source extraction; inferred = derived by reasoning; predicted = model output. For legal, medical, or financial RAG, restrict to observed only.
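The two tips above can be captured as named retrieval policies so that every call site applies the same thresholds. This is a sketch under assumptions: the preset names and the exact pairing of thresholds with observation classes are illustrative choices, not ArcFlow settings.

```typescript
type ObservationClass = "observed" | "inferred" | "predicted";

interface RetrievalPolicy {
  minConfidence: number;
  allowedClasses: ObservationClass[];
}

// Hypothetical presets derived from the tips; tune them for your own domain.
const POLICIES: Record<"general" | "precise" | "safetyCritical", RetrievalPolicy> = {
  general: { minConfidence: 0.7, allowedClasses: ["observed", "inferred"] },
  precise: { minConfidence: 0.85, allowedClasses: ["observed", "inferred"] },
  safetyCritical: { minConfidence: 0.85, allowedClasses: ["observed"] },
};

// True when a fact clears both the confidence threshold and the class filter.
function isAdmissible(
  fact: { confidence: number; observationClass: ObservationClass },
  policy: RetrievalPolicy
): boolean {
  return (
    fact.confidence > policy.minConfidence &&
    policy.allowedClasses.some((c) => c === fact.observationClass)
  );
}
```

The same values can be passed as query parameters, for example as the `minConf` parameter in the Node.js example, so the filter runs inside the engine.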

Graph structure scales value: The gap between flat vector retrieval and graph retrieval grows with relationship richness. A chunk graph with RELATED_TO, CITES, CONTRADICTS, and SUPPORTS edges gives the LLM reasoning structure that a chunk array cannot.

Embedding dimensions: ArcFlow's vector index supports any dimensionality: 384 (small sentence models), 768 (BERT-size models), 1536 (large text embedding models), 3072 (extended embedding models).
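Whatever dimensionality you pick, the vectors you write and query with should match the value declared on the index. A small client-side guard can catch a mismatch before it reaches the engine. This is a minimal sketch; `assertDimensions` is a hypothetical helper, not an ArcFlow function.

```typescript
// Throws if the vector does not have the expected length or contains NaN/Infinity.
function assertDimensions(vector: number[], expected: number): void {
  if (vector.length !== expected) {
    throw new Error(`expected ${expected} dimensions, got ${vector.length}`);
  }
  if (!vector.every((x) => Number.isFinite(x))) {
    throw new Error("embedding contains a non-finite value");
  }
}

// Example: a 3-component vector checked against a 3-dimensional index.
assertDimensions([0.12, -0.34, 0.56], 3);
```

Calling the guard with the index's declared dimension (1536 for the index created in Step 2) right after the embedding model returns keeps the failure close to its cause.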

GPU acceleration: On large graphs, vector search dispatches automatically to CUDA or Metal when available. No code changes.

See Also

  • Trusted RAG — capability reference with confidence algebra and evidence chains
  • Vector Search — vector index configuration and query patterns
  • Procedures — algo.vectorSearch, algo.graphRAGTrusted, algo.graphRAGContext
  • Confidence & Provenance — scoring model and provenance edges
  • Building a World Model — the foundational pattern