Use Case: Grounded Neural Objects
Neural world models generate detections, pose estimates, predicted future states. Their output is ephemeral — tensors, tracks, video latents. They do not remember. They do not version. They cannot tell you where a specific entity was at a specific timestamp with a specific confidence score from a specific sensor.
ArcFlow is the grounding layer. It lifts real-world objects — players, vehicles, robots, equipment — from neural outputs into persistent entities, each carrying spatial position, per-dimension confidence, observation class, provenance, and temporal history. The gap between "a detection" and "a known entity in the world" is where most production systems break down. ArcFlow closes it.
The Problem#
Tracking systems produce detections. Detections are ephemeral — they appear, disappear, get new IDs after occlusions, and lose continuity across camera handoffs. The gap between "I detected something" and "I know what this object is, where it's been, and what it can do" is where most production systems break down.
World-model labs ask: how do objects emerge from pixels? Production systems ask: how do real objects stay identifiable, queryable, and actionable under occlusion, handoff, and sub-second control loops?
The Architecture#
Five layers, from physical reality to downstream actions:
1. Physical object#
The real thing: player, ball, referee, vehicle, robot, sensor, pitch zone.
2. Perceptual evidence plane#
Detection, pose estimation, keypoints, ReID embeddings, multicam association, triangulation. This layer produces evidence — timestamped, confidence-scored observations with provenance.
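As a sketch of what one unit of evidence might look like — the field names below are illustrative, not part of any ArcFlow API; the text above only requires that each observation be timestamped, confidence-scored, and provenance-carrying:

```typescript
// Hypothetical shape for one unit of perceptual evidence.
interface Observation {
  sensorId: string       // provenance: which camera/sensor produced it
  timestampMs: number    // capture time
  kind: 'detection' | 'pose' | 'keypoints' | 'reid' | 'triangulation'
  bbox?: [number, number, number, number]  // x, y, w, h in image space
  embedding?: number[]   // ReID appearance vector, when kind is 'reid'
  confidence: number     // detector score for this single observation
}

// One detection from a (hypothetical) north-side camera.
const obs: Observation = {
  sensorId: 'cam-north-01',
  timestampMs: 1_700_000_000_000,
  kind: 'detection',
  bbox: [412, 218, 64, 128],
  confidence: 0.93,
}
```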
3. Grounded Neural Object (GNO)#
A persistent object handle tied to evidence. Not a vague embedding — a typed runtime contract:
```typescript
interface GroundedNeuralObject {
  gnoId: string          // Persistent ID (distinct from detector track IDs)
  physicalClass: string  // 'player' | 'ball' | 'referee' | 'vehicle'
  venueId: string        // Deployment site
  tickerId: number       // Frame-aligned monotonic clock

  // 3D state
  pose3d: [number, number, number]
  velocity: [number, number, number]
  skeleton3d?: number[][]

  // Identity
  appearanceEmbedding: number[]
  reidSignature: string
  occlusionState: 'visible' | 'partial' | 'occluded' | 'lost'
  visibilityByCamera: Record<string, number> // cameraId → confidence

  // Confidence (per-dimension, not scalar)
  confidence: {
    position: number
    identity: number
    relation: number
  }

  // Evidence trail
  evidenceSources: string[]
  lineage: string[]
  traceId: string

  // Learned sidecar (rides alongside the physical track)
  sidecar: {
    dynamicsLatent: number[]
    affordanceScores: Record<string, number>
  }

  // Graph relationships
  relations: Array<{
    type: string
    targetGnoId: string
    confidence: number
  }>
}
```

Key rules:

- `gnoId` is distinct from detector track IDs — identity survives detector loss
- Every update carries provenance and confidence
- Relations are first-class, not post-hoc annotations
- Downstream consumers (framing, zoom, analytics) consume the same object instance
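A concrete instance of the contract helps make it tangible. All values below are invented for illustration; the structure follows the interface:

```typescript
// A minimal GroundedNeuralObject value with illustrative data.
const player10 = {
  gnoId: 'gno-player-10',   // persistent ID, not the detector track ID
  physicalClass: 'player',
  venueId: 'wembley',
  tickerId: 4821,
  pose3d: [52.3, 34.1, 0.0] as [number, number, number],
  velocity: [1.8, -0.4, 0.0] as [number, number, number],
  appearanceEmbedding: [0.12, -0.07, 0.31],
  reidSignature: 'reid-abc123',
  occlusionState: 'visible' as const,
  visibilityByCamera: { 'cam-north-01': 0.95, 'cam-south-02': 0.41 },
  confidence: { position: 0.95, identity: 0.92, relation: 0.88 },
  evidenceSources: ['cam-north-01', 'cam-south-02'],
  lineage: ['det-7741', 'fusion-112'],
  traceId: 'trace-9f3a',
  sidecar: { dynamicsLatent: [0.2, 0.5], affordanceScores: { pass: 0.7 } },
  relations: [{ type: 'NEAR', targetGnoId: 'gno-ball-1', confidence: 0.9 }],
}
```

Note how the evidence trail and per-dimension confidence travel with the object itself rather than living in a separate logging system.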
4. ArcFlow runtime#
This is where tracking becomes knowledge:
```typescript
import { open } from 'arcflow'

const db = open('./venue-graph')

// Mint a grounded neural object
db.mutate(`CREATE (p:Player {
  gnoId: $gnoId,
  physicalClass: 'player',
  venueId: $venueId,
  name: $name,
  teamId: $teamId,
  x: $x, y: $y, z: $z,
  speed: $speed,
  confidence_position: $confPos,
  confidence_identity: $confId,
  occlusionState: 'visible',
  reidSignature: $reid
})`, {
  gnoId: 'gno-player-10',
  venueId: 'wembley',
  name: 'Player 10',
  teamId: 'home',
  x: 52.3, y: 34.1, z: 0.0,
  speed: 7.2,
  confPos: 0.95,
  confId: 0.92,
  reid: 'reid-abc123'
})

// Create a live relationship view: who's near the ball?
db.mutate(`CREATE LIVE VIEW near_ball AS
  MATCH (p:Player) MATCH (b:Ball)
  WHERE p.confidence_position > 0.8
  RETURN p.gnoId, p.name, p.x, p.y, b.x AS ballX, b.y AS ballY
`)

// Update position (delta propagation — not full recompute)
db.mutate("MATCH (p:Player {gnoId: $id}) SET p.x = $x", {
  id: 'gno-player-10', x: 53.1
})
// Live view updates automatically via CDC

// Query: who's nearest to the ball right now? Order by squared 2D distance.
const nearest = db.query(`
  MATCH (p:Player)
  MATCH (b:Ball)
  RETURN p.name, p.x, p.y, p.speed, p.confidence_position
  ORDER BY (p.x - b.x) * (p.x - b.x) + (p.y - b.y) * (p.y - b.y) ASC
  LIMIT 3
`)

// Temporal: where was player 10 at minute 35?
const historical = db.query(
  "MATCH (p:Player {gnoId: $id}) AS OF $ts RETURN p.x, p.y, p.speed",
  { id: 'gno-player-10', ts: matchStartTimestamp + (35 * 60) }
)
```

ArcFlow provides:
- Stable identity continuity — `gnoId` persists through occlusions and camera handoffs
- Relations as graph edges — `(:Player)-[:NEAR]->(:Ball)` maintained in real time
- Delta propagation — one position update triggers live view refresh, not full recompute
- Confidence propagation — per-dimension confidence flows through graph operations
- Batch/incremental equivalence — same query, both modes, provably identical results
- Temporal snapshots — `AS OF` queries for any point in time
- Lineage — every state change traceable to evidence sources
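The batch/incremental equivalence property can be illustrated with a toy in-memory model. This is not ArcFlow code, just the guarantee it makes: maintaining a "players above confidence 0.8" view delta by delta must produce exactly what a full recompute produces.

```typescript
// Toy demonstration of batch/incremental equivalence (illustration only).
type Row = { gnoId: string; conf: number }

// Batch path: recompute the view from the full current state.
function batchView(rows: Row[]): string[] {
  return rows.filter(r => r.conf > 0.8).map(r => r.gnoId).sort()
}

// Incremental path: maintain the view as a set, applying one delta at a time.
function applyDelta(view: Set<string>, row: Row): void {
  if (row.conf > 0.8) view.add(row.gnoId)
  else view.delete(row.gnoId)
}

const updates: Row[] = [
  { gnoId: 'gno-player-10', conf: 0.95 },
  { gnoId: 'gno-player-4', conf: 0.62 },
  { gnoId: 'gno-player-10', conf: 0.71 },  // confidence drops: leaves the view
  { gnoId: 'gno-player-7', conf: 0.88 },
]

const latest = new Map<string, Row>()    // latest state per object, for batch
const incremental = new Set<string>()    // incrementally maintained view
for (const u of updates) {
  latest.set(u.gnoId, u)
  applyDelta(incremental, u)
}

const batch = batchView(Array.from(latest.values()))
const delta = Array.from(incremental).sort()
// batch and delta must be identical — divergence should be zero.
```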
5. Downstream actions#
Framing, zoom, replay, analytics, operator prompts, robotic PTZ control, API consumers — all read from the same grounded neural object. No separate data models per consumer.
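The "no separate data models" point can be sketched in a few lines: both consumers below read the same record, so one update is visible to all of them (pure illustration, not ArcFlow API):

```typescript
// Illustration: framing and analytics consume the same entity record.
type Gno = { gnoId: string; x: number; y: number; speed: number }

const ball: Gno = { gnoId: 'gno-ball-1', x: 48.0, y: 30.5, speed: 12.1 }

// Framing consumer: where to point the camera.
const framingTarget = (o: Gno) => ({ panX: o.x, panY: o.y })

// Analytics consumer: is the ball in fast play?
const isFastPlay = (o: Gno) => o.speed > 10

// One update, seen by every consumer — no per-consumer copies to reconcile.
ball.x = 49.2
```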
Use case: Ball + nearest 2 players through camera handoff#
The smallest viable experiment:
Build#
```typescript
const db = open('./venue-graph')

// Ingest detector/tracker/fusion outputs
function ingestDetection(det: Detection) {
  db.mutate(`
    MERGE (obj:${det.class} {gnoId: $gnoId})
    SET obj.x = $x
    SET obj.y = $y
    SET obj.z = $z
    SET obj.speed = $speed
    SET obj.confidence_position = $confPos
    SET obj.occlusionState = $occState
  `, {
    gnoId: det.gnoId,
    x: det.x, y: det.y, z: det.z,
    speed: det.speed,
    confPos: det.confidence,
    occState: det.occluded ? 'occluded' : 'visible'
  })
}

// Maintain live relation: nearest players to the ball (squared 2D distance)
db.mutate(`CREATE LIVE VIEW ball_proximity AS
  MATCH (p:Player) MATCH (b:Ball)
  WHERE p.confidence_position > 0.7
  RETURN p.gnoId, p.name, p.x, p.y, b.x AS bx, b.y AS by
  ORDER BY (p.x - b.x) * (p.x - b.x) + (p.y - b.y) * (p.y - b.y) ASC
  LIMIT 3
`)

// Drive framing from the same object
function getFramingTarget() {
  return db.query("MATCH (b:Ball) RETURN b.x, b.y, b.confidence_position")
}
```

Measure#
- ID switch rate (how often `gnoId` incorrectly changes)
- Occlusion reacquisition latency (ms to recover after occlusion)
- Tracking ↔ framing disagreement rate
- Batch vs delta divergence (should be zero)
- Operator override rate
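The first metric can be computed directly from an assignment log. A simplified sketch: count a switch whenever the `gnoId` assigned to the same ground-truth object changes between consecutive observations (full MOT metrics such as IDF1 are more involved; this is the minimal version):

```typescript
// Simplified ID switch rate over an assignment log (illustration).
// Each entry: which gnoId the system assigned to a ground-truth object at a tick.
type Assignment = { tick: number; truthId: string; gnoId: string }

function idSwitchRate(log: Assignment[]): number {
  const last = new Map<string, string>()  // truthId -> previously assigned gnoId
  let switches = 0
  let transitions = 0
  const sorted = [...log].sort((a, b) => a.tick - b.tick)
  for (const a of sorted) {
    const prev = last.get(a.truthId)
    if (prev !== undefined) {
      transitions++
      if (prev !== a.gnoId) switches++  // identity incorrectly changed
    }
    last.set(a.truthId, a.gnoId)
  }
  return transitions === 0 ? 0 : switches / transitions
}

const log: Assignment[] = [
  { tick: 1, truthId: 'p10', gnoId: 'gno-a' },
  { tick: 2, truthId: 'p10', gnoId: 'gno-a' },
  { tick: 3, truthId: 'p10', gnoId: 'gno-b' },  // switch after an occlusion
  { tick: 4, truthId: 'p10', gnoId: 'gno-b' },
]
// One switch across three tick-to-tick transitions.
```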
Applicable domains#
| Domain | Physical objects | Key challenge |
|---|---|---|
| Sports analytics | Players, ball, referee, formations | Camera handoff, occlusion, sub-second control |
| Fleet management | Vehicles, warehouses, delivery zones | GPS drift, tunnel loss, geofence transitions |
| IoT / robotics | Sensors, robots, equipment, zones | Sensor failure, partial observability |
| Security / CCTV | People, vehicles, restricted areas | Multi-camera identity, dwell detection |
| Gaming / simulation | NPCs, items, terrain, projectiles | State synchronization, spatial awareness |
Key insight#
Neural world models are generative prediction engines — trained on petabytes of sensor data to simulate how the world evolves under actions. They are extraordinary at anticipating futures. They are not a database. They do not remember that gno-player-10 was at (52.3, 34.1) at tick 4821 with identity confidence 0.92, or that the ReID signature changed during a camera handoff.
ArcFlow is the operational complement. Neural outputs arrive as detections; ArcFlow lifts them into persistent, queryable entities with spatial position, per-dimension confidence, temporal history, and live-view relations. The learned sidecar rides alongside the physical track — it does not replace it.
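The sidecar relationship can be made concrete: a learned update merges into the sidecar fields only and never rewrites the physical track. The code below is an illustration of that discipline, not ArcFlow API:

```typescript
// Illustration: a learned update touches the sidecar, not the physical track.
type Gno = {
  gnoId: string
  pose3d: [number, number, number]   // physical track: owned by perception
  sidecar: {                         // learned layer: owned by the model
    dynamicsLatent: number[]
    affordanceScores: Record<string, number>
  }
}

function applySidecarUpdate(gno: Gno, update: Partial<Gno['sidecar']>): Gno {
  // Physical state is copied through untouched; only the sidecar merges.
  return { ...gno, sidecar: { ...gno.sidecar, ...update } }
}

const before: Gno = {
  gnoId: 'gno-player-10',
  pose3d: [52.3, 34.1, 0.0],
  sidecar: { dynamicsLatent: [0.1], affordanceScores: { pass: 0.5 } },
}
const after = applySidecarUpdate(before, {
  affordanceScores: { pass: 0.7, shoot: 0.3 },
})
```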
Operational causality — the exact sensor-to-entity-to-zone chain, stored as a graph:
```
-- Which sensor saw the player who entered the danger zone that triggered the shutdown?
MATCH (s:Sensor)-[d:DETECTED]->(p:Player)-[:IN_ZONE]->(z:Zone {hazard: 'restricted'})
WHERE d.confidence >= 0.88
RETURN s.name AS sensor, s.type, d.confidence, p.gnoId, p.name, z.name AS zone
ORDER BY d.confidence DESC
```

Deterministic. Sub-millisecond. Against the exact history of every detection, every sensor, every identity assignment. The actual causal chain from sensor to entity to zone to consequence — stored as a graph, traversable instantly. This is what ArcFlow is built for.
Neural world models simulate. ArcFlow records. The GNO is the point where both layers meet.
See Also#
- Robotics & Perception — sensor fusion, track lifecycle, confidence-scored observations
- Trusted RAG — confidence-filtered retrieval for learned properties
- Confidence & Provenance — observation classes and evidence chains
- Live Queries — standing queries that update learned properties as new evidence arrives
- Autonomous Systems — fleet-level coordination on a shared world model