# Migration Guide
If you've worked with Cypher-style or other graph query systems before, most of what you know transfers. ArcFlow's WorldCypher is Cypher-inspired; your existing MATCH, CREATE, MERGE, and WHERE clauses usually carry over directly. What changes: the deployment model (in-process or lightweight server), built-in GPU-accelerated algorithms, confidence scoring, temporal queries, and vector search with no add-ons.
This guide covers what stays the same, what's different, and how to move your data.
## Architecture Differences
| Aspect | Typical Graph Engine | ArcFlow |
|---|---|---|
| Deployment | Usually server-based or embedded depending on stack | In-process library or lightweight server mode |
| Runtime | Varies by implementation | Native binary, no runtime dependency |
| Memory model | Off-heap page cache or shared memory | In-process shared memory (SoC architecture) |
| Serialization | Wire protocol or RPC boundary | None (in-process) or TCP/HTTP |
| Query language | Varies by graph stack | WorldCypher (ArcFlow's GQL dialect with spatial/temporal/reactive extensions) |
| Indexes | B-tree, full-text, vector (varies by vendor) | Property index, HNSW vector index |
| Transactions | Full ACID (most vendors) | WAL-based journaling |
| Clustering | Causal clustering, replication (varies) | Single-node (replication planned) |
| GPU acceleration | None | CUDA + Metal (17 kernels each) |
| Vector search | Add-on or separate product (varies) | HNSW index + algo.graphRAG built-in |
| Temporal | Varies by implementation | Built-in: AS OF, clock domains, temporal replay |
| Confidence scoring | None | Built-in: 0.0-1.0 per node/edge |
ArcFlow can run as an in-process engine with zero network overhead, or as a standalone server. The in-process mode eliminates serialization and connection management entirely.
## GQL Compatibility
If you're coming from a Cypher-family system, most of your queries port directly. If you're coming from a different graph query model, expect a query rewrite, but the graph modeling concepts still transfer.
### Fully Compatible (GQL family)
These Cypher features work identically in WorldCypher:
- Statements: CREATE, MATCH, OPTIONAL MATCH, WHERE, RETURN, ORDER BY, SKIP, LIMIT, DELETE, DETACH DELETE, SET, REMOVE, MERGE, WITH, UNION, UNWIND, CASE WHEN
- Predicates: AND, OR, NOT, XOR, =, <>, <, >, <=, >=, IS NULL, IS NOT NULL, IN, CONTAINS, STARTS WITH, ENDS WITH, EXISTS
- Functions: id(), labels(), type(), count(), sum(), avg(), min(), max(), collect(), coalesce(), toString(), toInteger(), toFloat(), toLower(), toUpper(), trim(), substring(), left(), right(), replace(), split(), reverse()
- Patterns: node patterns (n:Label {prop: val}), relationship patterns -[:TYPE]->, variable-length paths [*1..5], shortestPath()
- Schema: CREATE INDEX, CREATE CONSTRAINT (unique), DROP INDEX, DROP CONSTRAINT
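As a quick sanity check, a read query that uses only the constructs listed above should run unchanged. This is a hypothetical example (the Person label, KNOWS relationship, and property names are illustrative, not part of any required schema):

```cypher
-- Standard Cypher; ports to WorldCypher as-is
MATCH (p:Person)-[:KNOWS*1..3]->(friend:Person)
WHERE p.name STARTS WITH 'A' AND friend.age >= 21
WITH p, collect(friend.name) AS friends
RETURN p.name, friends
ORDER BY p.name
LIMIT 10
```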
### WorldCypher Extensions
Features in WorldCypher that go beyond standard GQL — ArcFlow-specific capabilities:
| Feature | Syntax | Purpose |
|---|---|---|
| Temporal snapshots | MATCH (n) AS OF 1700000000000 | Point-in-time queries |
| Confidence scoring | confidence(n), observationClass(n) | Trust metadata on every node/edge |
| Authority planes | authorityPlane(n) | Semantic vs scene classification |
| Clock domains | clockDomain(n) | Multi-clock temporal modeling |
| Observation source | observationSource(n) | Provenance tracking |
| Map projection | RETURN n {.name, .age} | Project specific properties |
| toJson | toJson(n) | Serialize node to JSON |
| hash | hash(n.prop) | FNV-1a deterministic hashing |
| Graph algorithms | CALL algo.pageRank() | 20+ built-in algorithms (no separate plugin) |
| PROCESS NODE | PROCESS NODE (n:Label) ... | Batch node processing with constructors |
| REPROCESS | REPROCESS | Re-run edge constructors |
| Vector search | CALL algo.vectorSearch() | HNSW nearest-neighbor search |
| GraphRAG | CALL algo.graphRAGTrusted() | Confidence-filtered RAG pipeline |
| EXPLAIN / PROFILE | EXPLAIN MATCH ... | Query plan introspection |
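To show how these extensions combine, here is a sketch that pairs a temporal snapshot with confidence filtering and map projection, following the syntax in the table above (the Person label, timestamp, and 0.8 threshold are illustrative):

```cypher
-- Read the graph as it existed at a past timestamp,
-- keeping only well-attested nodes
MATCH (n:Person) AS OF 1700000000000
WHERE confidence(n) >= 0.8
RETURN n {.name, .age}, confidence(n), observationSource(n)
```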
### Not Yet Supported
Standard GQL features or ecosystem-specific extensions not yet available in WorldCypher:
| Feature | Status | Alternative |
|---|---|---|
| FOREACH | Not implemented | Use UNWIND + CREATE |
| CALL {} IN TRANSACTIONS | Not applicable | In-process, no batching needed |
| LOAD CSV | Different syntax | :import csv path/to/file.csv |
| Plugin/extension procedures | Not applicable | Built-in equivalents where needed |
| SHOW DATABASES | Single database | CALL db.stats() |
| Subqueries (CALL {}) | Not implemented | Use WITH for pipeline queries |
| CREATE ... SET chaining | Use comma syntax | CREATE (n:Label {prop1: v1, prop2: v2}) |
| Multi-database | Not implemented | Single graph store |
| Role-based access | Planned | Currently single-user |
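As a concrete example of the first substitution, a FOREACH loop over a literal list can usually be rewritten with UNWIND (the list and label here are illustrative):

```cypher
-- Before (FOREACH, not supported):
-- FOREACH (name IN ['Alice', 'Bob'] | CREATE (:Person {name: name}))

-- After (UNWIND + CREATE):
UNWIND ['Alice', 'Bob'] AS name
CREATE (:Person {name: name})
```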
## Performance Snapshot
Recent ArcFlow benchmark runs on the current engine show strong single-node throughput in the embedded path:
| Workload | Current throughput |
|---|---|
| IS1: person profile | 10.6M/s |
| Property scan | 14.6M/s |
| Count(Person) | 43.7M/s |
| 3-hop traversal | 780.0K/s |
| Upsert (1K get_or_create) | 10.1M/s |
| Bulk insert + edges | 503.7K/s |
| Index build single | 151.3/s |
| Index build composite | 343.8/s |
For the exact benchmark methodology and the latest corrected report, see the engine benchmark artifacts in the main ArcFlow repository.
## Migration Steps
### Step 1: Export from Your Current Database
Export your data as Cypher statements, CSV, or JSON from your current stack. The exact command depends on your existing tooling, but the migration target is the same: node rows, relationship rows, and any schema definitions you want to recreate.
```cypher
-- Most GQL-compatible databases support exporting as CREATE statements
-- Adapt vendor-specific export commands to your setup
```

### Step 2: Adapt the Export
WorldCypher accepts most Cypher CREATE statements directly. Key adaptations:
- Remove environment-specific syntax (proprietary ID functions, internal properties)
- Replace FOREACH with UNWIND
- Remove CALL {} IN TRANSACTIONS wrappers
- Convert non-Cypher query definitions into Cypher-style mutations where needed
- Keep CREATE, MERGE, and SET statements as-is
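For instance, a batched-import wrapper can simply be dropped and its inner statement kept as-is. This is a sketch; the node shape is illustrative:

```cypher
-- Before (batching wrapper, not needed in ArcFlow):
-- CALL {
--   CREATE (:Person {name: 'Alice'})
-- } IN TRANSACTIONS OF 1000 ROWS

-- After (keep the inner statement directly):
CREATE (:Person {name: 'Alice'})
```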
### Step 3: Import into ArcFlow
```shell
# Start ArcFlow CLI
arcflow --data-dir ./my-graph

# Import CSV files
:import csv /tmp/nodes.csv
```

Or use CREATE statements directly:
```cypher
-- Nodes import directly
CREATE (n:Person {name: 'Alice', age: 30})
CREATE (n:Person {name: 'Bob', age: 25})

-- Relationships
MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})
CREATE (a)-[:KNOWS {since: 2020}]->(b)
```

### Step 4: Recreate Indexes
```cypher
CREATE INDEX ON :Person(name)
CREATE INDEX ON :Person(email)
CREATE CONSTRAINT ON (p:Person) ASSERT p.email IS UNIQUE
```

### Step 5: Verify
```cypher
-- Check counts match
CALL db.stats()

-- Check schema
CALL db.schema()

-- Run diagnostics
CALL db.doctor()
```

## Graph Analytics: Built In
ArcFlow ships graph algorithms directly in the engine. No projection step, no separate installation.
| Common Analytics Procedure | ArcFlow Equivalent |
|---|---|
| PageRank stream | CALL algo.pageRank() |
| Betweenness centrality | CALL algo.betweenness() |
| Closeness centrality | CALL algo.closeness() |
| Degree centrality | CALL algo.degreeCentrality() |
| Weakly connected components | CALL algo.connectedComponents() |
| Louvain community detection | CALL algo.louvain() |
| Leiden community detection | CALL algo.leiden() |
| Local clustering coefficient | CALL algo.clusteringCoefficient() |
| Triangle count | CALL algo.triangleCount() |
| K-core decomposition | CALL algo.kCore() |
| Node similarity | CALL algo.nodeSimilarity() |
| All-pairs shortest path | CALL algo.allPairsShortestPath() |
| K-nearest neighbors | CALL algo.nearestNodes() |
Key difference: no projection step. With most graph analytics libraries, you first create an in-memory graph projection, run the algorithm, then optionally write results back. In ArcFlow, algorithms run directly on the live graph store with zero copy.
```cypher
-- Traditional analytics workflow (3 steps):
-- 1. Project graph into analytics engine
-- 2. Run algorithm on projection
-- 3. Drop projection

-- ArcFlow (1 step):
CALL algo.pageRank()
```

## When to Use Which
Use ArcFlow when:
- Embedding a graph engine in your application (no server management)
- You need GPU-accelerated graph algorithms
- You want built-in vector search + graph context (GraphRAG)
- Temporal queries and confidence scoring matter for your domain
- You want a single binary with no runtime dependency
Use a server-oriented graph platform when:
- You need multi-user concurrent access at scale
- Full ACID transactions are a hard requirement
- Your graph exceeds available RAM
- You need a large plugin ecosystem
- You need enterprise features (RBAC, causal clustering, CDC)
## See Also
- QuickStart -- get running in 5 minutes
- Built-in Functions -- all 83 functions
- Procedures -- 100+ procedures
- RAG Pipeline -- building RAG with ArcFlow