Errors
Every ArcFlow error has 5 fields: class, code, message, failing_field, and recovery_suggestion. Agents can parse class + code to decide recovery strategy without reading the message.
TypedError {
  class: ErrorClass             // category for routing
  code: String                  // e.g., "INVALID_LABEL"
  message: String               // human-readable
  failing_field: String?        // which input was wrong
  recovery_suggestion: String?  // what to do about it
}
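As a rough illustration, a client could mirror this shape in TypeScript and check an unknown thrown value against it before branching on class and code. This is a sketch under assumptions: the names `ErrorClass`, `TypedError`, and `isTypedError` are hypothetical here, and the exact casing of class values as they reach a client is not confirmed by this page.

```typescript
// Hypothetical TypeScript mirror of the TypedError shape shown above.
// Assumption: class values arrive as the capitalised names from the class table.
type ErrorClass = "Validation" | "Integration" | "Architecture" | "Timeout";

interface TypedError {
  class: ErrorClass;
  code: string;
  message: string;
  failing_field?: string | null;
  recovery_suggestion?: string | null;
}

const ERROR_CLASSES: readonly string[] = [
  "Validation",
  "Integration",
  "Architecture",
  "Timeout",
];

// Optional fields may be absent, null, or a string.
function isOptionalString(x: unknown): boolean {
  return x === undefined || x === null || typeof x === "string";
}

// Type guard: returns true only when the value has the five-field shape.
function isTypedError(value: unknown): value is TypedError {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  const cls = v["class"];
  return (
    typeof cls === "string" &&
    ERROR_CLASSES.indexOf(cls) !== -1 &&
    typeof v["code"] === "string" &&
    typeof v["message"] === "string" &&
    isOptionalString(v["failing_field"]) &&
    isOptionalString(v["recovery_suggestion"])
  );
}
```

A guard like this lets calling code narrow a caught value once and then read `class` and `code` without further checks.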
Error Classes
| Class | Meaning | Whose fault |
|---|---|---|
| Validation | Input validation failure | Caller |
| Integration | Internal engine bug or misconfiguration | Engine |
| Architecture | Architectural constraint violation | Design |
| Timeout | Operation timed out | Environment |
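One possible way to act on the class alone is a small routing function. The sketch below is an assumption-laden example, not ArcFlow's API: the `RecoveryAction` names are hypothetical, and the mapping is simply one reading of the "Whose fault" column.

```typescript
// The four classes come from the table above; exact string casing is an assumption.
type ErrorClass = "Validation" | "Integration" | "Architecture" | "Timeout";

// Hypothetical actions an agent might take; these names are not part of ArcFlow.
type RecoveryAction =
  | "fix_input"            // caller's fault: correct the request and resend
  | "escalate_to_operator" // engine's fault: the caller cannot fix it
  | "rethink_design"       // design constraint: the approach itself must change
  | "retry_later";         // environment: the same request may succeed later

function routeByClass(errorClass: ErrorClass): RecoveryAction {
  switch (errorClass) {
    case "Validation":
      return "fix_input";
    case "Integration":
      return "escalate_to_operator";
    case "Architecture":
      return "rethink_design";
    case "Timeout":
      return "retry_later";
  }
}
```

Because the switch covers every member of the union, the TypeScript compiler flags any class added to `ErrorClass` later that the function does not handle.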
Error Codes
| Code | Category | Cause | Fix |
|---|---|---|---|
| EXPECTED_KEYWORD | parse | Query syntax error | Check MATCH / CREATE / MERGE syntax |
| UNEXPECTED_TOKEN | parse | Unexpected token in query | Check clause order and punctuation |
| UNKNOWN_FUNCTION | validation | Function name not recognised | Run CALL db.help() for full list |
| UNKNOWN_PROCEDURE | validation | CALL target not found | Run CALL db.procedures() |
| UNKNOWN_LABEL | validation | Label doesn't exist in schema | Check CALL db.schema() |
| INVALID_PARAMETER | validation | Parameter type mismatch | Ensure param types match QueryParams |
| MISSING_PARAMETER | validation | $param used but not supplied | Pass all parameters in the params object |
| DB_CLOSED | integration | Query after db.close() | Don't query a closed database |
| LOCK_POISONED | integration | Write lock poisoned by panic | Restart the database process |
| COMPILE_ERR | parse | Query failed to compile | Check query against GQL reference |
| VECTOR_DIM_MISMATCH | validation | Query vector ≠ index dimensions | Match vector length to OPTIONS {dimensions} |
| INDEX_NOT_FOUND | validation | Named index doesn't exist | Check CALL db.indexes |
| CONSTRAINT_VIOLATION | validation | Unique constraint failed | Use MERGE instead of CREATE |
| WORKFLOW_NOT_FOUND | validation | arcflow.workflow.* target not found | Run CALL arcflow.workflow.list |
| STEP_NOT_FOUND | validation | Step name not in workflow | Check step name spelling |
| EXECUTION_CONTEXT_MISMATCH | integration | requireExecutionContext guard failed | CALL db.setExecutionContext(...) first |
| UNKNOWN_EXECUTION_CONTEXT | validation | Invalid context string | Use local_cpu, local_gpu, or distributed |
| TEMPORAL_WAL_NOT_WIRED | integration | AS OF seq N without WAL context | Open the store with WAL enabled (open() or openInMemory() with sync options) |
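Some of these codes can be avoided before a query is sent. As one example, the sketch below is a client-side preflight for the MISSING_PARAMETER case: it scans a query for `$name` placeholders and lists those absent from the params object. The helper name `findMissingParams` is hypothetical, the placeholder pattern is an assumption about identifier syntax, and the scan is naive (it would also match a `$word` inside a string literal), so the engine's own parser remains the authority.

```typescript
// Hypothetical helper, not part of ArcFlow: lists placeholders with no supplied value.
// Assumption: placeholders look like $identifier (letters, digits, underscore).
function findMissingParams(
  query: string,
  params: Record<string, unknown>
): string[] {
  const placeholder = /\$([A-Za-z_][A-Za-z0-9_]*)/g;
  const missing: string[] = [];
  let match: RegExpExecArray | null;

  // exec() with the global flag advances through the query one match at a time.
  while ((match = placeholder.exec(query)) !== null) {
    const name = match[1];
    const supplied = Object.prototype.hasOwnProperty.call(params, name);
    // Record each missing name once, in order of first appearance.
    if (!supplied && missing.indexOf(name) === -1) {
      missing.push(name);
    }
  }
  return missing;
}

// Usage: $name is supplied, $age is not.
const missingParams = findMissingParams(
  "MATCH (n:Person {name: $name}) WHERE n.age > $age RETURN n",
  { name: "Ada" }
);
// missingParams is ["age"]
```

A check like this only saves a round trip; a caller still needs to handle MISSING_PARAMETER from the engine.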
Design Principle
Every error is fail-fast and typed. No generic "something went wrong."
Agents can parse class + code to decide recovery strategy without reading
the message field.
AIOps Telemetry (opt-in)
ArcFlow can optionally send error and performance telemetry to OZ's AIOps observability unit. This is off by default. When enabled, the engine streams structured error events and performance signals to aiops.oz.com over an encrypted channel.
Why opt in? Every error that reaches OZ gets processed by an automated diagnostics pipeline. Patterns across deployments surface issues that no single operator would catch: a GPU memory pressure trend that precedes failures, a query pattern that degrades under specific graph shapes, a clock drift that corrupts temporal indexes. The more engines report, the faster ArcFlow gets for everyone.
What gets sent
- TypedError events (class, code, failing_field; never query content or graph data)
- Performance counters: query latency percentiles, GPU utilization, memory pressure, WAL write throughput
- Engine metadata: version, deployment type (native, Docker, PanoNode, browser), OS, GPU model
No graph data, no query strings, no user content. The telemetry payload is structural, not semantic.
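To make "structural, not semantic" concrete, the sketch below projects an error onto only the three fields the list above names. It is purely illustrative: the actual telemetry wire format is not documented on this page, and `TelemetryErrorEvent` and `toTelemetryEvent` are hypothetical names.

```typescript
// Minimal local copy of the error shape so the example is self-contained.
interface TypedError {
  class: string;
  code: string;
  message: string;
  failing_field?: string | null;
  recovery_suggestion?: string | null;
}

// Hypothetical event shape: only the fields listed under "What gets sent".
interface TelemetryErrorEvent {
  class: string;
  code: string;
  failing_field: string | null;
}

// Copies the structural fields and drops message and recovery_suggestion,
// which are free text and could echo user input.
function toTelemetryEvent(err: TypedError): TelemetryErrorEvent {
  return {
    class: err.class,
    code: err.code,
    failing_field: err.failing_field ?? null,
  };
}
```

Building the event from an explicit allow-list of fields, rather than deleting fields from a copy, means a field added to the error later is not sent by accident.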
Enabling telemetry
# Enable in CLI
arcflow --telemetry on
# Or set in config
arcflow config set telemetry.enabled true

Telemetry flows through the same sync channel that ArcFlow Cloud uses for fragment coordination. If your engine is already connected to ArcFlow Cloud, telemetry rides the existing connection. If not, it opens a dedicated encrypted channel to aiops.oz.com.
How it helps you
OZ's AIOps team monitors telemetry across the fleet continuously. If your engine hits a pattern that's been seen (and solved) elsewhere, you benefit from that fix faster. If your deployment surfaces a new edge case, the engineering team sees it before it becomes a problem. You're not debugging alone.
You can disable telemetry at any time. The engine runs identically with or without it.
See Also
- Error Handling: structured errors with machine-readable codes and recovery hints
- TypeScript API Reference: db.query() and db.mutate() error surfaces
- ArcFlow for Coding Agents: how agents pattern-match on error codes and self-correct