Engineering & AI · senior
Senior ML Engineer
About the Role
OZ is building the Physical-AI infrastructure layer for every physical environment. Our Visual Intelligence platform perceives, reasons, and acts in real time, and every model runs at the edge, on-venue, with zero cloud dependency.
As a Senior ML Engineer, you will design and optimize the inference models that power our OZ VI Venue deployments. Your models will run on NVIDIA GPUs inside OZ PanoNode enclosures mounted in stadiums and arenas worldwide. Latency budgets are measured in milliseconds. Accuracy is measured against published SLOs. There is no retry loop and no cloud fallback. Your models must be correct, fast, and deterministic.
This is AI 3.0: Physical AI. While most ML teams optimize models in cloud environments, you will ship inference that runs deterministically at the physical edge. Every model you build runs in production at venues across multiple continents, and you will see your work operating live within weeks of deployment. AI tools accelerate your workflow, from experiment tracking to model documentation, and the deployment flywheel means each venue's calibration data compounds into fleet-wide model improvements.
What You Will Do
- Design, train, and optimize computer vision models for real-time edge inference on NVIDIA GPUs
- Compress and quantize models for deployment on TensorRT and ONNX runtimes within strict latency budgets
- Build and maintain the inference pipeline that feeds the Venue Graph with spatial detections, classifications, and tracking data
- Collaborate with robotics engineers to close the loop between perception and PTZ camera actuation
- Establish model evaluation frameworks with deterministic benchmarks tied to published SLOs
- Contribute to the deployment flywheel: every venue deployment generates calibration data that improves model performance fleet-wide
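The quantization work in the second bullet can be illustrated with a minimal sketch of symmetric per-tensor INT8 post-training quantization, the basic arithmetic that tools like TensorRT's calibrator or ONNX Runtime's quantizer apply internally. The function names below are illustrative only, not from any OZ codebase:

```python
# Minimal symmetric per-tensor INT8 quantization sketch.
# In production this is handled by TensorRT calibration or
# onnxruntime.quantization; shown here only to illustrate the math.

def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 0.03, 1.27]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Per-element quantization error is bounded by scale / 2.
```

The per-tensor scale keeps the kernel simple; per-channel scales, as used in most production quantizers, reduce error further for convolutional weights.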
What We Look For
- 5+ years of experience in ML engineering with a focus on computer vision and real-time inference
- Deep proficiency in PyTorch, with production experience in model optimization (quantization, pruning, distillation)
- Strong C++ and CUDA skills; you can profile GPU kernels and write custom operators when framework tooling falls short
- Production experience deploying models on TensorRT, ONNX Runtime, or equivalent edge inference frameworks
- Track record of shipping ML systems that operate under real-time constraints (sub-100ms end-to-end)
- Solid understanding of object detection, tracking, and pose estimation architectures
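The sub-100ms constraint above is the kind of budget a deterministic evaluation harness enforces. A minimal sketch of checking a p99 latency SLO (the budget value and function names are assumptions for illustration, not OZ's published SLOs):

```python
import statistics
import time

LATENCY_BUDGET_MS = 100.0  # assumed end-to-end budget, for illustration

def benchmark_p99(fn, iterations=200, warmup=20):
    """Measure per-call latency and return the p99 in milliseconds."""
    for _ in range(warmup):      # warm caches before timing
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) yields 99 cut points; index 98 is the p99
    return statistics.quantiles(samples, n=100)[98]

def fake_inference():
    time.sleep(0.001)  # stand-in for a model forward pass

p99 = benchmark_p99(fake_inference)
assert p99 < LATENCY_BUDGET_MS, f"p99 {p99:.2f} ms exceeds budget"
```

Gating deploys on a tail percentile rather than the mean is what makes a latency SLO meaningful: a fast average with a long tail still drops frames at the venue.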
Nice to Have
- Experience with multi-camera calibration and spatial reasoning across overlapping fields of view
- Background in sports analytics, broadcast production, or venue technology
- Familiarity with NVIDIA DeepStream or similar video analytics pipelines
- Experience with model serving at scale across a fleet of heterogeneous edge devices
- Contributions to open-source ML/CV projects
What We Offer
- Early-stage equity (ESOP): you build the company, so you own part of it
- Hardware budget for your development setup, GPUs included
- Annual learning budget for courses, certifications, and conferences
- Remote-flexible work with async-first culture across Silicon Valley, Reykjavík, Nagpur, and Dubai
- Direct impact: your models run live in stadiums within weeks of deployment
- A growing, high-impact team where your contributions are visible and your scope expands with the company