Recipe: Batch Projection

Project entities and relationships into the world model at scale. MERGE semantics make every batch idempotent: replaying the same data converges to the same graph instead of creating duplicates.

Basic batch

const mutations = [
  "MERGE (p:Person {id: 'p1', name: 'Alice'})",
  "MERGE (p:Person {id: 'p2', name: 'Bob'})",
  "MERGE (o:Org {id: 'o1', name: 'Acme'})",
]
const count = db.batchMutate(mutations)
// count === 3
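
To see why a replay converges, here is a toy in-memory model of MERGE. It is not the database's implementation: it assumes openCypher-style matching, where MERGE matches on the label plus every property in the pattern, and `ToyGraph` and its `merge` method are hypothetical names used only for illustration.

```typescript
// Toy model only: a node is identified by its label plus its full property map,
// mirroring how openCypher MERGE matches the whole pattern (an assumption here).
type Props = Record<string, string | number | boolean>

class ToyGraph {
  private nodes = new Map<string, { label: string; props: Props }>()

  // Create the node only if no node with the same label and properties exists.
  merge(label: string, props: Props): void {
    const sortedKeys = Object.keys(props).sort()
    const key = label + '|' + JSON.stringify(sortedKeys.map(k => [k, props[k]]))
    if (!this.nodes.has(key)) {
      this.nodes.set(key, { label, props })
    }
  }

  get size(): number {
    return this.nodes.size
  }
}

const g = new ToyGraph()
const batch: Array<[string, Props]> = [
  ['Person', { id: 'p1', name: 'Alice' }],
  ['Person', { id: 'p2', name: 'Bob' }],
  ['Org', { id: 'o1', name: 'Acme' }],
]
for (const [label, props] of batch) g.merge(label, props)
for (const [label, props] of batch) g.merge(label, props) // replay the same batch
console.log(g.size) // 3
```

Under this model, replaying with a changed property (for example a different `name` for `p1`) would not match the existing node and would add a second one, so it helps to keep the MERGE pattern stable across runs.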

Entity + relationship batch

Split the batch into two phases so that both endpoints of every relationship exist before it is linked:

// Phase 1: Create all nodes
db.batchMutate([
  "MERGE (p:Person {id: 'p1', name: 'Alice', workspaceId: 'ws1'})",
  "MERGE (o:Org {id: 'o1', name: 'Acme', workspaceId: 'ws1'})",
  "MERGE (f:Fact {uuid: 'f1', predicate: 'employment', confidence: 0.87})",
])
 
// Phase 2: Create all relationships (nodes guaranteed to exist)
db.batchMutate([
  "MATCH (p:Person {id: 'p1'}) MATCH (f:Fact {uuid: 'f1'}) MERGE (p)-[:SUBJECT_OF]->(f)",
  "MATCH (f:Fact {uuid: 'f1'}) MATCH (o:Org {id: 'o1'}) MERGE (f)-[:OBJECT_IS]->(o)",
])
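
Phase 2 relies on each MATCH finding its node. In openCypher, a MATCH that finds nothing yields no rows, so the MERGE that follows silently creates nothing; assuming the same holds here, a pre-flight check can catch relationships whose endpoints were never projected. The `NodeRef` and `RelRef` shapes and the `findDangling` helper below are hypothetical, not part of the database API.

```typescript
// Hypothetical shapes for illustration; not part of the database API.
interface NodeRef { label: string; id: string }
interface RelRef { from: NodeRef; to: NodeRef; type: string }

// Return relationships whose source or target is missing from the phase 1 node list.
function findDangling(nodes: NodeRef[], rels: RelRef[]): RelRef[] {
  const known = new Set(nodes.map(n => `${n.label}:${n.id}`))
  const has = (n: NodeRef) => known.has(`${n.label}:${n.id}`)
  return rels.filter(r => !has(r.from) || !has(r.to))
}

const nodes: NodeRef[] = [
  { label: 'Person', id: 'p1' },
  { label: 'Fact', id: 'f1' },
]
const rels: RelRef[] = [
  { from: { label: 'Person', id: 'p1' }, to: { label: 'Fact', id: 'f1' }, type: 'SUBJECT_OF' },
  { from: { label: 'Fact', id: 'f1' }, to: { label: 'Org', id: 'o1' }, type: 'OBJECT_IS' },
]
const dangling = findDangling(nodes, rels)
console.log(dangling.map(r => r.type)) // [ 'OBJECT_IS' ]
```

The check only sees nodes in the current batch, so a relationship that points at a node created by an earlier batch would be flagged as well; treat the result as a warning list, not a hard failure.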

Pipeline function

interface PipelineEntity {
  label: string
  id: string
  properties: Record<string, string | number | boolean>
}
 
interface PipelineRelation {
  fromLabel: string
  fromId: string
  toLabel: string
  toId: string
  relType: string
}
 
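function projectBatch(db: ArcflowDB, entities: PipelineEntity[], relations: PipelineRelation[]) {
  // NOTE: labels, ids, and values are interpolated verbatim;
  // validate or escape untrusted input before calling this function.

  // Phase 1: Entities. Build the property map from a single list so an empty
  // `properties` object does not leave a trailing comma after `id`.
  const entityMutations = entities.map(e => {
    const props = [
      `id: '${e.id}'`,
      ...Object.entries(e.properties)
        .filter(([k]) => k !== 'id') // avoid emitting `id` twice
        .map(([k, v]) => `${k}: ${typeof v === 'string' ? `'${v}'` : v}`),
    ].join(', ')
    return `MERGE (n:${e.label} {${props}})`
  })
  if (entityMutations.length > 0) {
    db.batchMutate(entityMutations)
  }

  // Phase 2: Relationships (runs after phase 1 so both endpoints exist)
  const relMutations = relations.map(r =>
    `MATCH (a:${r.fromLabel} {id: '${r.fromId}'}) MATCH (b:${r.toLabel} {id: '${r.toId}'}) MERGE (a)-[:${r.relType}]->(b)`
  )
  if (relMutations.length > 0) {
    db.batchMutate(relMutations)
  }
}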
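
Because projectBatch interpolates values verbatim, a name such as O'Brien would terminate the quoted literal early. Below is a minimal sketch of a literal formatter, assuming Cypher-style backslash escapes inside single-quoted strings (the query language reference is the authority on the exact rules). `toLiteral` is a hypothetical helper, not part of the database API.

```typescript
// Hypothetical helper: render a JS value as a query literal.
// Assumes single-quoted strings with backslash escapes, as in Cypher.
function toLiteral(v: string | number | boolean): string {
  if (typeof v === 'string') {
    // Escape backslashes first, then single quotes.
    return `'${v.replace(/\\/g, '\\\\').replace(/'/g, "\\'")}'`
  }
  if (typeof v === 'number' && !Number.isFinite(v)) {
    throw new Error(`cannot encode non-finite number: ${v}`)
  }
  return String(v)
}

console.log(toLiteral("O'Brien")) // 'O\'Brien'
console.log(toLiteral(0.87))      // 0.87
console.log(toLiteral(true))      // true
```

Labels, relationship types, and property keys cannot be quoted this way; validating them against an identifier pattern before interpolation is a reasonable safeguard.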

Performance notes

batchMutate executes all queries under a single write lock, so there is no per-query locking overhead.
MERGE is idempotent by design, so re-running a batch is safe.
For very large batches (10K+ mutations), split the work into chunks of ~1000 mutations.
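
The chunking advice can be sketched as follows. `BatchDB` is a hypothetical minimal interface containing only the `batchMutate` call used in this recipe, assumed to return the number of mutations applied as in the basic example; `batchMutateChunked` is likewise a hypothetical helper, and its default chunk size of 1000 follows the guideline above.

```typescript
// Hypothetical minimal interface: only the call used in this recipe.
interface BatchDB {
  batchMutate(queries: string[]): number
}

// Run mutations in fixed-size chunks and return the summed counts.
function batchMutateChunked(db: BatchDB, mutations: string[], chunkSize = 1000): number {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError('chunkSize must be a positive integer')
  }
  let total = 0
  for (let i = 0; i < mutations.length; i += chunkSize) {
    total += db.batchMutate(mutations.slice(i, i + chunkSize))
  }
  return total
}

// Stub database that records chunk sizes, for demonstration only.
const chunkSizes: number[] = []
const stubDb: BatchDB = {
  batchMutate(queries) {
    chunkSizes.push(queries.length)
    return queries.length
  },
}

const many = Array.from({ length: 2500 }, (_, i) => `MERGE (n:Item {id: 'i${i}'})`)
console.log(batchMutateChunked(stubDb, many)) // 2500
console.log(chunkSizes)                       // [ 1000, 1000, 500 ]
```

Each chunk is a separate `batchMutate` call, so the single-write-lock behaviour applies per chunk, not to the whole run. Since the mutations are MERGE-based, a run that fails partway can be replayed from the start.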
