Edge-First AI: Why Sub-Second Latency Is the Only Metric That Matters

Article · March 4, 2026


In most AI products, the model is the product. At OZ, latency is the product.

When a ball changes direction, when a person enters a restricted zone, when a camera needs to track an unpredictable trajectory, the system has milliseconds to respond. Not seconds. Not "near real-time." Milliseconds.

That constraint defines the architecture.

The non-negotiable boundary

OZ splits its computing stack along one clear boundary: what must happen in real time runs on the venue edge. Everything else runs in the cloud.

On-venue edge (time-critical):

  • Perception: multi-camera spatial tracking, entity detection, scene understanding
  • Cueing: AI-driven camera directives for robotic capture heads
  • Control: deterministic control loops governing capture priorities and zone policies
  • Spatial output: structured data delivered to downstream systems via the Venue Graph
  • Recovery: self-healing capture loops that restart failed components without external intervention

Cloud (improvement-critical):

  • Model training: new perception models trained on aggregated, anonymized telemetry
  • Fleet analytics: cross-venue performance comparison and trend detection
  • Playbook optimization: automated refinement of commissioning and operating procedures
  • Reporting: long-term dashboards, SLO compliance history, and capacity planning
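The boundary above reduces to a single predicate: is the workload in the real-time control path? A minimal sketch of that routing decision (the workload names and `time_critical` flags are illustrative, not OZ's internal taxonomy):

```python
from dataclasses import dataclass
from enum import Enum

class Tier(Enum):
    EDGE = "on-venue edge"    # time-critical: must run locally
    CLOUD = "cloud"           # improvement-critical: tolerates latency

@dataclass(frozen=True)
class Workload:
    name: str
    time_critical: bool

    @property
    def tier(self) -> Tier:
        # One predicate decides placement: anything in the real-time
        # control path stays on the venue edge.
        return Tier.EDGE if self.time_critical else Tier.CLOUD

WORKLOADS = [
    Workload("perception", True),
    Workload("cueing", True),
    Workload("control", True),
    Workload("model-training", False),
    Workload("fleet-analytics", False),
]

edge = [w.name for w in WORKLOADS if w.tier is Tier.EDGE]
cloud = [w.name for w in WORKLOADS if w.tier is Tier.CLOUD]
```

The point of encoding it this way is that placement is not tuned per deployment; it is a property of the workload itself.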

This is not a caching strategy or a performance optimization. It is a product architecture decision. The venue operates autonomously at the edge. The cloud makes the next deployment better.

Why cloud-first AI fails for physical operations

Cloud-first AI architectures assume three things:

  1. Network connectivity is reliable
  2. Round-trip latency is acceptable
  3. The processing window is flexible

In physical venue operations, all three assumptions fail:

Network is not reliable. Venues are physical environments. Construction, weather, crowd density, and infrastructure age all affect connectivity. A venue that depends on cloud inference stops working when the network degrades.

Round-trip latency is not acceptable. A cloud inference call adds 50-200ms of network latency on top of processing time. For a robotic camera that needs to track a fast-moving entity, that delay means the subject has already left the frame.

The processing window is not flexible. In live operations, there is no "retry later." The moment passes. The data is stale. The capture opportunity is lost. Real-time means real-time, not "fast enough most of the time."
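To make the latency argument concrete, a back-of-envelope calculation (the speed and latency figures are assumed for illustration, not OZ measurements):

```python
# How far does a fast-moving subject travel while the system waits?
SPEED_M_S = 30.0  # assumed subject speed (~108 km/h, e.g. a struck ball)

scenarios = {
    "cloud round trip": 150.0,  # mid-range of the 50-200 ms network add-on
    "edge budget":      120.0,  # the p99 end-to-end target discussed below
    "edge perception":   15.0,  # a single assumed on-venue processing step
}

for label, latency_ms in scenarios.items():
    drift_m = SPEED_M_S * latency_ms / 1000.0
    print(f"{label:18s}: {latency_ms:5.0f} ms -> subject moves {drift_m:.2f} m")
```

At these assumed speeds, a cloud round trip alone lets the subject drift several meters before any directive can be issued, which is the arithmetic behind "the subject has already left the frame."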

The timing chain

At OZ, we measure and publish the full timing chain from photon capture to spatial output:

  1. Photon to sensor: light hits the sensor array
  2. Sensor to perception: raw frames processed by edge GPU
  3. Perception to fusion: multi-camera signals combined into spatial state
  4. Fusion to cueing: AI generates camera directives
  5. Cueing to execution: robotic capture heads respond
  6. State to Venue Graph: structured spatial output delivered via API

The published target: p99 latency ≤120ms end-to-end. Measured per venue, published per deployment.

This timing chain is the product specification. Not a benchmark. Not an aspiration. A contractual commitment.
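Verifying a p99 target means aggregating per-stage latencies across many frames and checking the 99th percentile, not the average. A toy sketch of that measurement (stage names mirror the chain above; the simulated latency values are invented, and the real instrumentation is OZ-internal):

```python
import random
from statistics import quantiles

random.seed(42)  # reproducible toy run

# Stage names mirroring the chain above.
STAGES = ["sensor", "perception", "fusion", "cueing", "execution", "graph"]

def measure_frame() -> dict:
    """Simulate per-stage latencies in milliseconds for one frame."""
    return {s: random.uniform(2.0, 20.0) for s in STAGES}

def p99(samples):
    # statistics.quantiles with n=100 yields 99 cut points;
    # index 98 is the 99th-percentile boundary.
    return quantiles(samples, n=100)[98]

end_to_end = [sum(measure_frame().values()) for _ in range(10_000)]
BUDGET_MS = 120.0
print(f"p99 = {p99(end_to_end):.1f} ms (budget {BUDGET_MS:.0f} ms)")
```

A tail percentile is the right contractual metric here: an average hides the occasional slow frame, and in live capture the slow frame is exactly the one that loses the moment.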

Network outage resilience

The edge-first architecture provides a critical operational guarantee: a network outage does not interrupt the spatial layer.

When connectivity to the cloud drops:

  • Perception continues: all models run on local GPU
  • Cueing continues: camera directives execute from local state
  • Control continues: policies and priorities enforce from cached configuration
  • Spatial output continues: downstream systems on the venue network receive uninterrupted data
  • Telemetry buffers: operational data queues locally and syncs when connectivity returns

Zero data loss. Zero operational interruption. The venue does not know the cloud is unreachable.
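The telemetry-buffering behavior can be sketched as a local queue that drains only when the uplink succeeds. This `TelemetryBuffer` is a hypothetical illustration; the real buffering, batching, and backoff logic in the OZ stack is not public:

```python
from collections import deque

class TelemetryBuffer:
    """Local queue that absorbs cloud outages without data loss."""

    def __init__(self, uplink):
        self._uplink = uplink   # callable(record) -> bool (True = delivered)
        self._queue: deque = deque()

    def record(self, event: dict) -> None:
        # Enqueue first, then try to drain: ordering is preserved
        # and nothing is dropped while the uplink is down.
        self._queue.append(event)
        self.flush()

    def flush(self) -> int:
        sent = 0
        while self._queue:
            if not self._uplink(self._queue[0]):
                break           # cloud unreachable: keep buffering
            self._queue.popleft()
            sent += 1
        return sent

# Usage: an uplink that is down during capture, then recovers.
online = {"up": False}
delivered = []

def uplink(rec):
    if online["up"]:
        delivered.append(rec)
        return True
    return False

buf = TelemetryBuffer(uplink)
for i in range(3):
    buf.record({"frame": i})    # queued locally while offline
online["up"] = True
buf.flush()                     # drains in order on reconnect
```

Because records are only dequeued after a confirmed delivery, an outage of any length degrades nothing except the freshness of cloud-side dashboards.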

What the cloud contributes

The cloud isn't irrelevant; it's where the system improves:

Model improvement: Every venue generates edge cases that the current models handle imperfectly. Training pipelines aggregate anonymized data across the network to produce better models that deploy to all venues simultaneously.

Playbook refinement: Commissioning telemetry from every deployment feeds the operational playbook. The cloud analyzes patterns (which steps take longest, which environments cause calibration drift, which failure modes recur) and updates the procedures.

Fleet intelligence: Cross-venue comparison reveals performance outliers. If one venue consistently achieves a lower MTTR (mean time to recovery), the cloud identifies the configuration difference and propagates it.
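The outlier step can be as simple as flagging venues whose metric sits more than one standard deviation from the fleet mean. A toy sketch (the venue names, MTTR figures, and threshold are invented; real fleet telemetry is OZ-internal):

```python
from statistics import mean, stdev

# Hypothetical per-venue MTTR figures in minutes.
mttr = {"venue-a": 42.0, "venue-b": 45.0, "venue-c": 12.0, "venue-d": 44.0}

mu = mean(mttr.values())
sigma = stdev(mttr.values())

# Flag venues more than one standard deviation from the fleet mean.
outliers = {v: m for v, m in mttr.items() if abs(m - mu) > sigma}
# venue-c stands out as a low-MTTR outlier; the next step would be
# diffing its configuration against the fleet baseline.
```

In a real fleet this would run over rolling windows with a robust statistic, but the shape of the analysis (aggregate, compare, flag, diff configurations) is the same.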

The cloud makes the network smarter over time. The edge makes each venue reliable right now.

Infrastructure, not software

The edge-first architecture is why OZ is infrastructure, not software. Software runs in someone else's compute environment. Infrastructure runs in the physical environment where the work happens.

When your AI processes at the edge, you control the full execution path. When your AI processes in the cloud, you control a request-response cycle.

That is the difference between a product that observes and a system that executes.