Multi-source observation with confidence

If you want	Use
Naive mean (every row equal)	`avg(x)`
Trust-weighted mean	`avg_conf(x, conf)`
Trust-weighted total	`sum_conf(x, conf)`
How many high-confidence rows?	`count_conf(conf, threshold)`
Min / max among trusted rows	`min_conf(x, conf, t)` / `max_conf(x, conf, t)`

// Vendor A reports the play at high confidence:
CREATE (:Observation {
  play_id: 4711,
  speed_mph: 22.4,
  _confidence: 0.95,
  _observation_class: 'vendor_a'
})
 
// Vendor B reports the same play with lower confidence:
CREATE (:Observation {
  play_id: 4711,
  speed_mph: 23.1,
  _confidence: 0.7,
  _observation_class: 'vendor_b'
})
 
// A noisy ML estimate adds another low-confidence row:
CREATE (:Observation {
  play_id: 4711,
  speed_mph: 19.0,
  _confidence: 0.3,
  _observation_class: 'ml_v3'
})

// Consensus: compare the naive mean with the confidence-weighted mean.
MATCH (o:Observation)
WHERE o.play_id = 4711
RETURN avg(o.speed_mph)                       AS naive,
       avg_conf(o.speed_mph, o._confidence)   AS weighted
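The gap between the two columns can be checked by hand. The Python sketch below is only an illustration: it assumes `avg_conf` is the conventional weighted mean, sum(x * conf) / sum(conf), which this page does not spell out, and `naive_mean` and `weighted_mean` are hypothetical helper names.

```python
# Hand check of the consensus query for play 4711.
# ASSUMPTION: avg_conf(x, conf) = sum(x * conf) / sum(conf).
# (speed_mph, _confidence) pairs copied from the three CREATE statements.
observations = [
    (22.4, 0.95),  # vendor_a
    (23.1, 0.70),  # vendor_b
    (19.0, 0.30),  # ml_v3
]


def naive_mean(rows):
    """Every row counts equally, like avg(x)."""
    return sum(x for x, _ in rows) / len(rows)


def weighted_mean(rows):
    """Each row counts in proportion to its confidence."""
    total_weight = sum(c for _, c in rows)
    return sum(x * c for x, c in rows) / total_weight


naive = naive_mean(observations)
weighted = weighted_mean(observations)
print(f"naive={naive:.2f} weighted={weighted:.2f}")  # naive=21.50 weighted=22.13
```

Under that assumption the low-confidence ML estimate drags the naive mean down to 21.5 mph, while the weighted value stays near 22.1 mph, closer to the two vendor readings.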

// Threshold filter: count and range of the rows that clear the 0.5 confidence threshold.
MATCH (o:Observation)
WHERE o.play_id = 4711
RETURN count_conf(o._confidence, 0.5)              AS trusted_count,
       min_conf(o.speed_mph, o._confidence, 0.5)   AS min_speed,
       max_conf(o.speed_mph, o._confidence, 0.5)   AS max_speed
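As a rough cross-check of the threshold query, the same filtering can be sketched in Python. The sketch assumes a row counts as trusted when its confidence is at or above the threshold; whether the boundary is inclusive is not stated on this page, and it makes no difference for this data set.

```python
# Hand check of the threshold query for play 4711.
# ASSUMPTION: a row is "trusted" when conf >= threshold (inclusive boundary assumed).
# (speed_mph, _confidence) pairs copied from the three CREATE statements.
observations = [
    (22.4, 0.95),  # vendor_a
    (23.1, 0.70),  # vendor_b
    (19.0, 0.30),  # ml_v3
]
THRESHOLD = 0.5

# Keep only the speeds whose confidence clears the threshold.
trusted = [x for x, c in observations if c >= THRESHOLD]

trusted_count = len(trusted)
min_speed = min(trusted, default=None)  # None if no row is trusted
max_speed = max(trusted, default=None)
print(trusted_count, min_speed, max_speed)  # 2 22.4 23.1
```

The 19.0 mph ML estimate falls below the threshold, so it affects neither the count nor the range.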

// Per-source breakdown: one weighted mean and one row count per observation class.
MATCH (o:Observation)
WHERE o.play_id = 4711
WITH o._observation_class AS source,
     avg_conf(o.speed_mph, o._confidence) AS speed,
     count(o)                              AS n
RETURN source, speed, n
ORDER BY speed DESC
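In the query above, the non-aggregated `source` column acts as the grouping key, so each observation class gets its own weighted mean and row count. A minimal Python sketch of that grouping follows, again assuming `avg_conf` is sum(x * conf) / sum(conf); the variable names are illustrative only.

```python
from collections import defaultdict

# (source, speed_mph, confidence) rows for play 4711, copied from the CREATE statements.
rows = [
    ("vendor_a", 22.4, 0.95),
    ("vendor_b", 23.1, 0.70),
    ("ml_v3", 19.0, 0.30),
]

# Group rows by source, mirroring the grouping on the non-aggregated column.
groups = defaultdict(list)
for source, speed, conf in rows:
    groups[source].append((speed, conf))

# ASSUMPTION: avg_conf is sum(x * conf) / sum(conf) within each group.
breakdown = []
for source, members in groups.items():
    weight = sum(c for _, c in members)
    speed = sum(x * c for x, c in members) / weight
    breakdown.append((source, speed, len(members)))

# ORDER BY speed DESC
breakdown.sort(key=lambda r: r[1], reverse=True)
for source, speed, n in breakdown:
    print(f"{source}: {speed:.1f} mph from {n} row(s)")
```

With a single row per source, each weighted mean collapses to that source's own reading, so the breakdown mostly serves to show which source sits where before the rows are combined.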

Modelling

Querying the consensus

Filtering by confidence threshold

Per-source breakdown

When to use which aggregate

Anti-pattern