Trusted RAG
Standard RAG retrieves flat chunks. The LLM gets text — no confidence level, no provenance trail, no structure showing how facts connect.
Trusted RAG runs on a world model. Every document and fact is a graph node with a confidence score (_confidence), an observation class (_observation_class: 'observed' | 'inferred' | 'predicted'), and a provenance edge back to its source. Retrieval follows relationship paths, not just vector proximity. The LLM receives not just what was found but how confident the system is in it — and why.
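The shape of such a node can be sketched in TypeScript. The field names (`_confidence`, `_observation_class`) follow the description above; the types and the `isTrusted` helper are illustrative, not part of ArcFlow's API:

```typescript
// Sketch of the record shape a Trusted RAG node carries.
// Illustrative types only, not ArcFlow's actual API surface.
type ObservationClass = 'observed' | 'inferred' | 'predicted'

interface TrustedNode {
  content: string
  source: string            // provenance: where this fact came from
  _confidence: number       // 0..1
  _observation_class: ObservationClass
  embedding?: number[]
}

// A retrieval policy can reason over these fields directly:
function isTrusted(n: TrustedNode, minConfidence = 0.7): boolean {
  return n._confidence >= minConfidence && n._observation_class !== 'predicted'
}
```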
ArcFlow handles the retrieval side: vector search, graph traversal, and confidence filtering compose in one engine, one query, no joins across services.
User query
│
▼
[1] Embed query ────────────────────────► Embedding model (external)
│
▼
[2] Vector search ──────────────────────► ArcFlow vector index
│
▼
[3] Graph context + confidence filter ──► Traverse relationships, filter by _confidence
│
▼
[4] Assemble prompt ────────────────────► Context with confidence scores + citations
│
▼
[5] LLM generation ─────────────────────► Response grounded in trusted facts
ArcFlow owns steps 2 and 3. Steps 1, 4, and 5 are external.
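The five steps can be sketched end to end in a toy TypeScript pipeline, with the external pieces (embedding model, LLM) stubbed out. All function names here are hypothetical, and ArcFlow performs steps 2 and 3 inside the engine rather than in application code:

```typescript
// Toy sketch of the five-step pipeline; embed() and llm() are stubs
// standing in for the external embedding model and LLM.
type Chunk = { text: string; confidence: number; embedding: number[] }

const embed = (q: string): number[] => [q.length % 3, 1, 0]          // [1] stub
const llm = (prompt: string): string => `answer based on: ${prompt}` // [5] stub

function pipeline(query: string, index: Chunk[]): string {
  const qv = embed(query)                                            // [1] embed query
  const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0)
  const hits = index
    .map(c => ({ c, score: dot(qv, c.embedding) }))                  // [2] vector search
    .filter(h => h.c.confidence > 0.7)                               // [3] confidence filter
    .sort((a, b) => b.score - a.score)
  const prompt = hits                                                // [4] assemble prompt
    .map(h => `[conf ${h.c.confidence}] ${h.c.text}`)
    .join('\n')
  return llm(prompt)                                                 // [5] generation
}
```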
Step 1: Store documents as confidence-scored nodes
Each document (or chunk) is a graph node with an embedding, a confidence score, and an observation class. The embedding vectors below are truncated for readability; real vectors match the index dimensionality:
CREATE (d:Document {
  title: 'Rust Ownership Model',
  content: 'Rust uses ownership rules to manage memory without garbage collection...',
  source: 'rust-book-ch4',
  _confidence: 0.98,
  _observation_class: 'observed',
  embedding: [0.12, -0.34, 0.56, 0.78, -0.91]
})

For chunked documents, create a parent-child structure:
-- Parent document
CREATE (doc:Document {
  title: 'Rust Book Chapter 4',
  source: 'rust-book',
  _confidence: 0.99,
  _observation_class: 'observed'
})
-- Chunks with embeddings
CREATE (c1:Chunk {
  text: 'Each value in Rust has a variable that is its owner...',
  position: 0,
  _confidence: 0.98,
  _observation_class: 'observed',
  embedding: [0.12, -0.34, 0.56]
})
CREATE (c2:Chunk {
  text: 'When the owner goes out of scope, the value is dropped...',
  position: 1,
  _confidence: 0.97,
  _observation_class: 'observed',
  embedding: [0.15, -0.28, 0.61]
})
-- Link chunks to document with provenance
MATCH (doc:Document {title: 'Rust Book Chapter 4'}), (c1:Chunk {position: 0}), (c2:Chunk {position: 1})
CREATE (doc)-[:HAS_CHUNK]->(c1)
CREATE (doc)-[:HAS_CHUNK]->(c2)

Link related concepts across chunks — this is what makes graph retrieval different from flat retrieval:
MATCH (c1:Chunk {position: 0}), (c2:Chunk {position: 1})
CREATE (c1)-[:RELATED_TO {weight: 0.85}]->(c2)

Step 2: Create a vector index
CREATE VECTOR INDEX chunk_embeddings
ON :Chunk(embedding)
DIMENSIONS 1536
SIMILARITY cosine

The index is built in memory, persisted via WAL, and supports cosine similarity, Euclidean distance, and dot product. Same engine as the graph — no separate service.
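For reference, the three supported similarity measures can be written out directly. This is a plain sketch of the math; the index computes them internally:

```typescript
// The three similarity measures the vector index supports.
const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0)
const norm = (a: number[]) => Math.sqrt(dot(a, a))

// Cosine: angle between vectors, magnitude-invariant (1 = identical direction).
const cosine = (a: number[], b: number[]) => dot(a, b) / (norm(a) * norm(b))

// Euclidean: straight-line distance (0 = identical vectors).
const euclidean = (a: number[], b: number[]) =>
  Math.sqrt(a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0))
```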
Step 3: Retrieve by vector similarity
-- Find the 10 chunks most similar to a query vector
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 10)
YIELD node AS chunk, score
WHERE chunk._confidence > 0.7
AND chunk._observation_class IN ['observed', 'inferred']
RETURN chunk.text, score, chunk._confidence, chunk._observation_class
ORDER BY score DESC

The confidence filter composes directly with the vector search — a single query, no post-processing join.
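To see why this composition matters, compare post-filtering a top-k list with filtering inside the search: post-filtering can silently return fewer than k results. A toy sketch over pre-scored candidates (all names hypothetical):

```typescript
interface Cand { text: string; score: number; confidence: number }

// Post-processing: take top k first, then drop low-confidence hits.
// May return fewer than k results.
function rankThenFilter(cands: Cand[], k: number, minConf: number): Cand[] {
  return [...cands].sort((a, b) => b.score - a.score).slice(0, k)
    .filter(c => c.confidence > minConf)
}

// Composed: filter first, then take top k. Always up to k trusted hits.
function filterThenRank(cands: Cand[], k: number, minConf: number): Cand[] {
  return cands.filter(c => c.confidence > minConf)
    .sort((a, b) => b.score - a.score).slice(0, k)
}
```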
Step 4: Enrich with graph context
After finding similar chunks, traverse the graph to collect structured context:
-- For each matched chunk, collect neighbors and their confidence
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 5)
YIELD node AS chunk, score
OPTIONAL MATCH (chunk)-[:RELATED_TO]->(related:Chunk)
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
RETURN
chunk.text AS match,
score AS similarity,
chunk._confidence AS chunk_confidence,
doc.title AS source,
doc.source AS source_id,
collect(related.text) AS related_context
ORDER BY score DESC

The LLM receives the matched chunk, its parent document (provenance), and related chunks (context). Not just text — structure.
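One way to turn rows like these into prompt text, assuming the column shape returned by the query above. The `ContextRow` type and `assemblePrompt` function are illustrative:

```typescript
// Rows matching the query above: match text, similarity, confidence,
// provenance, and related chunks. Shape assumed for illustration.
interface ContextRow {
  match: string
  similarity: number
  chunk_confidence: number
  source: string
  related_context: string[]
}

// Render each row with its confidence and citation so the LLM sees
// not just what was found, but how trustworthy it is and where it came from.
function assemblePrompt(rows: ContextRow[]): string {
  return rows
    .map(r =>
      `Source: ${r.source} (similarity ${r.similarity.toFixed(2)}, ` +
      `confidence ${r.chunk_confidence})\n${r.match}` +
      (r.related_context.length ? `\nRelated: ${r.related_context.join(' | ')}` : '')
    )
    .join('\n\n')
}
```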
Step 5: Confidence-filtered retrieval
For safety-critical or high-stakes retrieval, filter to only observed facts above a confidence threshold:
-- Only retrieve high-confidence observed facts
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 20)
YIELD node AS chunk, score
WHERE chunk._observation_class = 'observed'
AND chunk._confidence > 0.85
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
RETURN
chunk.text,
score,
chunk._confidence,
doc.title AS source,
doc.source AS citation
ORDER BY (score * chunk._confidence) DESC
LIMIT 5

Ranking by score * _confidence surfaces results that are both semantically similar and epistemically trustworthy.
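The combined ranking can be sketched in isolation. The `Hit` type and `rerank` function are hypothetical; ArcFlow applies this ordering inside the query itself:

```typescript
interface Hit { text: string; score: number; confidence: number }

// Order by similarity times confidence, mirroring
// ORDER BY (score * chunk._confidence) DESC in the query above.
function rerank(hits: Hit[], limit = 5): Hit[] {
  return [...hits]
    .sort((a, b) => b.score * b.confidence - a.score * a.confidence)
    .slice(0, limit)
}
```

A very similar but low-confidence hit can rank below a slightly less similar, well-attested one, which is the point of the combined ordering.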
Step 6: Full-text + vector + graph in one query
ArcFlow's three search modalities compose in a single query:
-- Full-text match for exact term, vector for semantic similarity, graph for context
CALL db.index.fulltext.queryNodes('chunk_ft', $term)
YIELD node AS ft_result, score AS ft_score
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 10)
YIELD node AS vec_result, score AS vec_score
WITH ft_result, ft_score, vec_result, vec_score
WHERE ft_result.id = vec_result.id -- same chunk matched both ways
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(ft_result)
RETURN
ft_result.text,
ft_score,
vec_score,
ft_result._confidence,
doc.title AS source
ORDER BY (ft_score + vec_score) DESC

Step 7: Connect to agents
napi-rs (Node.js — fastest path)
import { open } from 'arcflow'

const db = open('./data/knowledge-graph')

async function trustedRetrieval(queryVector: number[], minConfidence = 0.8) {
  return db.query(
    `CALL algo.vectorSearch('chunk_embeddings', $vec, 10)
     YIELD node AS chunk, score
     WHERE chunk._observation_class = 'observed'
       AND chunk._confidence > $minConf
     OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
     RETURN chunk.text, score, chunk._confidence, doc.title AS source
     ORDER BY (score * chunk._confidence) DESC
     LIMIT 5`,
    { vec: JSON.stringify(queryVector), minConf: minConfidence }
  )
}

CLI (coding agents — Claude Code, Codex, Gemini CLI)
arcflow query "
CALL algo.vectorSearch('chunk_embeddings', \$vec, 10)
YIELD node AS chunk, score
WHERE chunk._confidence > 0.8
RETURN chunk.text, score, chunk._confidence
ORDER BY score DESC
" --param vec='[0.12, -0.34, 0.56]'

Note the escaped \$vec: inside double quotes the shell would otherwise expand it before ArcFlow sees the query.

MCP (cloud chat UIs — ChatGPT, Claude.ai)
npx arcflow-mcp --data-dir ./data/knowledge-graph

The MCP server's graph_rag tool runs the full retrieval pipeline. The LLM queries it directly without generating code.
Tips
Confidence thresholds: Start at 0.7 for general retrieval. Raise to 0.85+ when precision matters more than recall. For safety-critical domains, only retrieve observed facts.
Observation class filtering: observed = direct source extraction; inferred = derived by reasoning; predicted = model output. For legal, medical, or financial RAG, restrict to observed only.
Graph structure scales value: The gap between flat vector retrieval and graph retrieval grows with relationship richness. A chunk graph with RELATED_TO, CITES, CONTRADICTS, and SUPPORTS edges gives the LLM reasoning structure that a chunk array cannot.
Embedding dimensions: ArcFlow's vector index supports any dimensionality: 384 (small sentence models), 768 (BERT-size models), 1536 (large text embedding models), 3072 (extended embedding models).
GPU acceleration: On large graphs, vector search dispatches automatically to CUDA or Metal when available. No code changes.
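The threshold and observation-class tips above can be sketched as a per-domain retrieval policy. The domain names and values here are illustrative, not prescribed by ArcFlow:

```typescript
type ObsClass = 'observed' | 'inferred' | 'predicted'

// Illustrative per-domain policy: stricter domains demand higher
// confidence and fewer observation classes.
const policies: Record<string, { minConfidence: number; allow: ObsClass[] }> = {
  general: { minConfidence: 0.7, allow: ['observed', 'inferred'] },
  medical: { minConfidence: 0.85, allow: ['observed'] },
}

function admit(domain: string, confidence: number, cls: ObsClass): boolean {
  const p = policies[domain] ?? policies.general
  return confidence >= p.minConfidence && p.allow.includes(cls)
}
```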
See Also
- Trusted RAG — capability reference with confidence algebra and evidence chains
- Vector Search — vector index configuration and query patterns
- Procedures — algo.vectorSearch, algo.graphRAGTrusted, algo.graphRAGContext
- Confidence & Provenance — scoring model and provenance edges
- Building a World Model — the foundational pattern