Trusted RAG
Standard RAG retrieves flat chunks. The LLM gets text — no confidence level, no provenance trail, no structure showing how facts connect.
Trusted RAG runs on a world model. Every document and fact is a graph node with a confidence score (_confidence), an observation class (_observation_class: 'observed' | 'inferred' | 'predicted'), and a provenance edge back to its source. Retrieval follows relationship paths, not just vector proximity. The LLM receives not just what was found but how confident the system is in it — and why.
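The shape of such a node can be sketched in TypeScript. The field names (`_confidence`, `_observation_class`) follow the description above; the types and the `isTrusted` helper are illustrative, not part of ArcFlow's API:

```typescript
// Sketch of the record shape a Trusted RAG node carries.
// Illustrative types only, not ArcFlow's actual API surface.
type ObservationClass = 'observed' | 'inferred' | 'predicted'

interface TrustedNode {
  content: string
  source: string            // provenance: where this fact came from
  _confidence: number       // 0..1
  _observation_class: ObservationClass
  embedding?: number[]
}

// A retrieval policy can reason over these fields directly:
function isTrusted(n: TrustedNode, minConfidence = 0.7): boolean {
  return n._confidence >= minConfidence && n._observation_class !== 'predicted'
}
```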
ArcFlow handles the retrieval side: vector search, graph traversal, and confidence filtering compose in one engine, one query, no joins across services.
User query
│
▼
[1] Embed query ────────────────────────► Embedding model (external)
│
▼
[2] Vector search ──────────────────────► ArcFlow vector index
│
▼
[3] Graph context + confidence filter ──► Traverse relationships, filter by _confidence
│
▼
[4] Assemble prompt ────────────────────► Context with confidence scores + citations
│
▼
[5] LLM generation ─────────────────────► Response grounded in trusted facts
ArcFlow owns steps 2 and 3. Steps 1, 4, and 5 are external.
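The five steps can be sketched end to end in a toy TypeScript pipeline, with the external pieces (embedding model, LLM) stubbed out. All function names here are hypothetical, and ArcFlow performs steps 2 and 3 inside the engine rather than in application code:

```typescript
// Toy sketch of the five-step pipeline; embed() and llm() are stubs
// standing in for the external embedding model and LLM.
type Chunk = { text: string; confidence: number; embedding: number[] }

const embed = (q: string): number[] => [q.length % 3, 1, 0]          // [1] stub
const llm = (prompt: string): string => `answer based on: ${prompt}` // [5] stub

function pipeline(query: string, index: Chunk[]): string {
  const qv = embed(query)                                            // [1] embed query
  const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0)
  const hits = index
    .map(c => ({ c, score: dot(qv, c.embedding) }))                  // [2] vector search
    .filter(h => h.c.confidence > 0.7)                               // [3] confidence filter
    .sort((a, b) => b.score - a.score)
  const prompt = hits                                                // [4] assemble prompt
    .map(h => `[conf ${h.c.confidence}] ${h.c.text}`)
    .join('\n')
  return llm(prompt)                                                 // [5] generation
}
```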
Step 1: Store documents as confidence-scored nodes
Each document (or chunk) is a graph node with an embedding, a confidence score, and an observation class. The embedding vectors below are truncated for readability; real vectors match the index dimensionality:
CREATE (d:Document {
  title: 'Rust Ownership Model',
  content: 'Rust uses ownership rules to manage memory without garbage collection...',
  source: 'rust-book-ch4',
  _confidence: 0.98,
  _observation_class: 'observed',
  embedding: [0.12, -0.34, 0.56, 0.78, -0.91]
})

For chunked documents, create a parent-child structure:
-- Parent document
CREATE (doc:Document {
  title: 'Rust Book Chapter 4',
  source: 'rust-book',
  _confidence: 0.99,
  _observation_class: 'observed'
})
-- Chunks with embeddings
CREATE (c1:Chunk {
  text: 'Each value in Rust has a variable that is its owner...',
  position: 0,
  _confidence: 0.98,
  _observation_class: 'observed',
  embedding: [0.12, -0.34, 0.56]
})
CREATE (c2:Chunk {
  text: 'When the owner goes out of scope, the value is dropped...',
  position: 1,
  _confidence: 0.97,
  _observation_class: 'observed',
  embedding: [0.15, -0.28, 0.61]
})
-- Link chunks to document with provenance
MATCH (doc:Document {title: 'Rust Book Chapter 4'}), (c1:Chunk {position: 0}), (c2:Chunk {position: 1})
CREATE (doc)-[:HAS_CHUNK]->(c1)
CREATE (doc)-[:HAS_CHUNK]->(c2)

Link related concepts across chunks — this is what makes graph retrieval different from flat retrieval:
MATCH (c1:Chunk {position: 0}), (c2:Chunk {position: 1})
CREATE (c1)-[:RELATED_TO {weight: 0.85}]->(c2)

Step 2: Create a vector index
CREATE VECTOR INDEX chunk_embeddings
ON :Chunk(embedding)
DIMENSIONS 1536
SIMILARITY cosine

The index is built in memory, persisted via WAL, and supports cosine similarity, Euclidean distance, and dot product. Same engine as the graph — no separate service.
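For reference, the three supported similarity measures can be written out directly. This is a plain sketch of the math; the index computes them internally:

```typescript
// The three similarity measures the vector index supports.
const dot = (a: number[], b: number[]) => a.reduce((s, x, i) => s + x * b[i], 0)
const norm = (a: number[]) => Math.sqrt(dot(a, a))

// Cosine: angle between vectors, magnitude-invariant (1 = identical direction).
const cosine = (a: number[], b: number[]) => dot(a, b) / (norm(a) * norm(b))

// Euclidean: straight-line distance (0 = identical vectors).
const euclidean = (a: number[], b: number[]) =>
  Math.sqrt(a.reduce((s, x, i) => s + (x - b[i]) ** 2, 0))
```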
Step 3: Retrieve by vector similarity
-- Find the 10 chunks most similar to a query vector
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 10)
YIELD node AS chunk, score
WHERE chunk._confidence > 0.7
AND chunk._observation_class IN ['observed', 'inferred']
RETURN chunk.text, score, chunk._confidence, chunk._observation_class
ORDER BY score DESC

The confidence filter composes directly with the vector search — a single query, no post-processing join.
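To see why this composition matters, compare post-filtering a top-k list with filtering inside the search: post-filtering can silently return fewer than k results. A toy sketch over pre-scored candidates (all names hypothetical):

```typescript
interface Cand { text: string; score: number; confidence: number }

// Post-processing: take top k first, then drop low-confidence hits.
// May return fewer than k results.
function rankThenFilter(cands: Cand[], k: number, minConf: number): Cand[] {
  return [...cands].sort((a, b) => b.score - a.score).slice(0, k)
    .filter(c => c.confidence > minConf)
}

// Composed: filter first, then take top k. Always up to k trusted hits.
function filterThenRank(cands: Cand[], k: number, minConf: number): Cand[] {
  return cands.filter(c => c.confidence > minConf)
    .sort((a, b) => b.score - a.score).slice(0, k)
}
```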
Step 4: Enrich with graph context
After finding similar chunks, traverse the graph to collect structured context:
-- For each matched chunk, collect neighbors and their confidence
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 5)
YIELD node AS chunk, score
OPTIONAL MATCH (chunk)-[:RELATED_TO]->(related:Chunk)
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
RETURN
chunk.text AS match,
score AS similarity,
chunk._confidence AS chunk_confidence,
doc.title AS source,
doc.source AS source_id,
collect(related.text) AS related_context
ORDER BY score DESC

The LLM receives the matched chunk, its parent document (provenance), and related chunks (context). Not just text — structure.
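One way to turn rows like these into prompt text, assuming the column shape returned by the query above. The `ContextRow` type and `assemblePrompt` function are illustrative:

```typescript
// Rows matching the query above: match text, similarity, confidence,
// provenance, and related chunks. Shape assumed for illustration.
interface ContextRow {
  match: string
  similarity: number
  chunk_confidence: number
  source: string
  related_context: string[]
}

// Render each row with its confidence and citation so the LLM sees
// not just what was found, but how trustworthy it is and where it came from.
function assemblePrompt(rows: ContextRow[]): string {
  return rows
    .map(r =>
      `Source: ${r.source} (similarity ${r.similarity.toFixed(2)}, ` +
      `confidence ${r.chunk_confidence})\n${r.match}` +
      (r.related_context.length ? `\nRelated: ${r.related_context.join(' | ')}` : '')
    )
    .join('\n\n')
}
```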
Step 5: Confidence-filtered retrieval
For safety-critical or high-stakes retrieval, filter to only observed facts above a confidence threshold:
-- Only retrieve high-confidence observed facts
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 20)
YIELD node AS chunk, score
WHERE chunk._observation_class = 'observed'
AND chunk._confidence > 0.85
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
RETURN
chunk.text,
score,
chunk._confidence,
doc.title AS source,
doc.source AS citation
ORDER BY (score * chunk._confidence) DESC
LIMIT 5

Ranking by score * _confidence surfaces results that are both semantically similar and epistemically trustworthy.
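The combined ranking can be sketched in isolation. The `Hit` type and `rerank` function are hypothetical; ArcFlow applies this ordering inside the query itself:

```typescript
interface Hit { text: string; score: number; confidence: number }

// Order by similarity times confidence, mirroring
// ORDER BY (score * chunk._confidence) DESC in the query above.
function rerank(hits: Hit[], limit = 5): Hit[] {
  return [...hits]
    .sort((a, b) => b.score * b.confidence - a.score * a.confidence)
    .slice(0, limit)
}
```

A very similar but low-confidence hit can rank below a slightly less similar, well-attested one, which is the point of the combined ordering.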
Step 6: Full-text + vector + graph in one query
ArcFlow's three search modalities compose in a single query:
-- Full-text match for exact term, vector for semantic similarity, graph for context
CALL db.index.fulltext.queryNodes('chunk_ft', $term)
YIELD node AS ft_result, score AS ft_score
CALL algo.vectorSearch('chunk_embeddings', $queryVector, 10)
YIELD node AS vec_result, score AS vec_score
WITH ft_result, ft_score, vec_result, vec_score
WHERE ft_result.id = vec_result.id -- same chunk matched both ways
OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(ft_result)
RETURN
ft_result.text,
ft_score,
vec_score,
ft_result._confidence,
doc.title AS source
ORDER BY (ft_score + vec_score) DESC

Step 7: Connect to agents
napi-rs (Node.js — fastest path)
import { open } from 'arcflow'

const db = open('./data/knowledge-graph')

async function trustedRetrieval(queryVector: number[], minConfidence = 0.8) {
  return db.query(
    `CALL algo.vectorSearch('chunk_embeddings', $vec, 10)
     YIELD node AS chunk, score
     WHERE chunk._observation_class = 'observed'
       AND chunk._confidence > $minConf
     OPTIONAL MATCH (doc:Document)-[:HAS_CHUNK]->(chunk)
     RETURN chunk.text, score, chunk._confidence, doc.title AS source
     ORDER BY (score * chunk._confidence) DESC
     LIMIT 5`,
    { vec: JSON.stringify(queryVector), minConf: minConfidence }
  )
}

CLI (coding agents — Claude Code, Codex, Gemini CLI)
arcflow query "
CALL algo.vectorSearch('chunk_embeddings', \$vec, 10)
YIELD node AS chunk, score
WHERE chunk._confidence > 0.8
RETURN chunk.text, score, chunk._confidence
ORDER BY score DESC
" --param vec='[0.12, -0.34, 0.56]'

Note the escaped \$vec: inside double quotes the shell would otherwise expand it before ArcFlow sees the query.

MCP (cloud chat UIs — ChatGPT, Claude.ai)
npx arcflow-mcp --data-dir ./data/knowledge-graph

The MCP server's graph_rag tool runs the full retrieval pipeline. The LLM queries it directly without generating code.
Tips
Confidence thresholds: Start at 0.7 for general retrieval. Raise to 0.85+ when precision matters more than recall. For safety-critical domains, only retrieve observed facts.
Observation class filtering: observed = direct source extraction; inferred = derived by reasoning; predicted = model output. For legal, medical, or financial RAG, restrict to observed only.
Graph structure scales value: The gap between flat vector retrieval and graph retrieval grows with relationship richness. A chunk graph with RELATED_TO, CITES, CONTRADICTS, and SUPPORTS edges gives the LLM reasoning structure that a chunk array cannot.
Embedding dimensions: ArcFlow's vector index supports any dimensionality: 384 (small sentence models), 768 (BERT-size models), 1536 (large text embedding models), 3072 (extended embedding models).
GPU acceleration: On large graphs, vector search dispatches automatically to CUDA or Metal when available. No code changes.
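The threshold and observation-class tips above can be sketched as a per-domain retrieval policy. The domain names and values here are illustrative, not prescribed by ArcFlow:

```typescript
type ObsClass = 'observed' | 'inferred' | 'predicted'

// Illustrative per-domain policy: stricter domains demand higher
// confidence and fewer observation classes.
const policies: Record<string, { minConfidence: number; allow: ObsClass[] }> = {
  general: { minConfidence: 0.7, allow: ['observed', 'inferred'] },
  medical: { minConfidence: 0.85, allow: ['observed'] },
}

function admit(domain: string, confidence: number, cls: ObsClass): boolean {
  const p = policies[domain] ?? policies.general
  return confidence >= p.minConfidence && p.allow.includes(cls)
}
```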
See Also
- Trusted RAG — capability reference with confidence algebra and evidence chains
- Vector Search — vector index configuration and query patterns
- Procedures — algo.vectorSearch, algo.graphRAGTrusted, algo.graphRAGContext
- Confidence & Provenance — scoring model and provenance edges
- Building a World Model — the foundational pattern