Engineering & AI · senior
Senior ML Engineer
About the Role
OZ is building the Physical-AI infrastructure layer for every physical environment. Our Visual Intelligence platform perceives, reasons, and acts in real time, and every model runs at the edge, on-venue, with zero cloud dependency.
As a Senior ML Engineer, you will design and optimize the inference models that power our OZ VI Venue deployments. Your models will run on NVIDIA GPUs inside OZ PanoNode enclosures mounted in stadiums and arenas worldwide. Latency budgets are measured in milliseconds. Accuracy is measured against published SLOs. There is no retry loop and no cloud fallback. Your models must be correct, fast, and deterministic.
This is AI 3.0: Physical AI. While most ML teams optimize models in cloud environments, you will ship inference that runs deterministically at the physical edge. Every model you build runs in production at venues across multiple continents, and you will see your work operating live within weeks of deployment. AI tools accelerate your workflow, from experiment tracking to model documentation, and the deployment flywheel means each venue's calibration data compounds into fleet-wide model improvements.
What You Will Do
- Design, train, and optimize computer vision models for real-time edge inference on NVIDIA GPUs
- Compress and quantize models for deployment on TensorRT and ONNX runtimes within strict latency budgets
- Build and maintain the inference pipeline that feeds the Venue Graph with spatial detections, classifications, and tracking data
- Collaborate with robotics engineers to close the loop between perception and PTZ camera actuation
- Establish model evaluation frameworks with deterministic benchmarks tied to published SLOs
- Contribute to the deployment flywheel: every venue deployment generates calibration data that improves model performance fleet-wide
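The quantization work in the second bullet can be illustrated with a minimal sketch of symmetric per-tensor INT8 post-training quantization, the basic arithmetic that tools like TensorRT's calibrator or ONNX Runtime's quantizer apply internally. The function names below are illustrative only, not from any OZ codebase:

```python
# Minimal symmetric per-tensor INT8 quantization sketch.
# In production this is handled by TensorRT calibration or
# onnxruntime.quantization; shown here only to illustrate the math.

def quantize_int8(values):
    """Map floats to int8 codes using a single symmetric scale."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127.0 if max_abs else 1.0
    q = [max(-128, min(127, round(v / scale))) for v in values]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float values from int8 codes."""
    return [x * scale for x in q]

weights = [0.5, -1.2, 0.03, 1.27]
q, scale = quantize_int8(weights)
approx = dequantize(q, scale)
# Per-element quantization error is bounded by scale / 2.
```

The per-tensor scale keeps the kernel simple; per-channel scales, as used in most production quantizers, reduce error further for convolutional weights.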
What We Look For
- 5+ years of experience in ML engineering with a focus on computer vision and real-time inference
- Deep proficiency in PyTorch, with production experience in model optimization (quantization, pruning, distillation)
- Strong C++ and CUDA skills; you can profile GPU kernels and write custom operators when framework tooling falls short
- Production experience deploying models on TensorRT, ONNX Runtime, or equivalent edge inference frameworks
- Track record of shipping ML systems that operate under real-time constraints (sub-100ms end-to-end)
- Solid understanding of object detection, tracking, and pose estimation architectures
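The sub-100ms constraint above is the kind of budget a deterministic evaluation harness enforces. A minimal sketch of checking a p99 latency SLO (the budget value and function names are assumptions for illustration, not OZ's published SLOs):

```python
import statistics
import time

LATENCY_BUDGET_MS = 100.0  # assumed end-to-end budget, for illustration

def benchmark_p99(fn, iterations=200, warmup=20):
    """Measure per-call latency and return the p99 in milliseconds."""
    for _ in range(warmup):      # warm caches before timing
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    # quantiles(n=100) yields 99 cut points; index 98 is the p99
    return statistics.quantiles(samples, n=100)[98]

def fake_inference():
    time.sleep(0.001)  # stand-in for a model forward pass

p99 = benchmark_p99(fake_inference)
assert p99 < LATENCY_BUDGET_MS, f"p99 {p99:.2f} ms exceeds budget"
```

Gating deploys on a tail percentile rather than the mean is what makes a latency SLO meaningful: a fast average with a long tail still drops frames at the venue.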
Nice to Have
- Experience with multi-camera calibration and spatial reasoning across overlapping fields of view
- Background in sports analytics, broadcast production, or venue technology
- Familiarity with NVIDIA DeepStream or similar video analytics pipelines
- Experience with model serving at scale across a fleet of heterogeneous edge devices
- Contributions to open-source ML/CV projects
What We Offer
- Early-stage equity (ESOP): you build the company, so you own part of it
- Hardware budget for your development setup, GPUs included
- Annual learning budget for courses, certifications, and conferences
- Remote-flexible work with async-first culture across Silicon Valley, Reykjavík, Nagpur, and Dubai
- Direct impact: your models run live in stadiums within weeks of deployment
- A growing, high-impact team where your contributions are visible and your scope expands with the company