Computer Vision Engineer (Detection, Tracking & 2D Metric Calibration Specialist)
Project Context
CrackCoach is an AI platform for automatic analysis of show-jumping videos.
This role builds the IMAGE-level perception and geometry stack that everything depends on: detection, tracking, obstacle understanding, jump segmentation, and metric calibration in real-world competition footage.
Without a rock-solid perception and geometric foundation, pose estimation, biomechanics, and AI coaching are not reliable.
⸻
Core Mission and Responsibilities
You will design, implement, and validate a production-grade computer vision pipeline capable of ingesting raw competition videos and producing robust, structured, and metric-aware outputs.
Your responsibilities include:
• Video ingestion and preprocessing: handle codecs, resolutions, FPS, orientation, stabilization, and cropping policies.
• Horse-and-rider detection using state-of-the-art detectors (YOLO / RT-DETR / Detectron2 or equivalent).
• Persistent tracking across frames (ByteTrack, BoT-SORT, DeepSORT, Kalman-based trackers).
• Obstacle detection and scene understanding for show-jumping arenas (rails, poles, standards).
• Obstacle-to-jump association logic: correctly identify which obstacle is being jumped and when.
• Automatic segmentation of a full round video into individual jump clips (per-obstacle segments).
• 2D trajectory reconstruction of the horse in image space with stable, low-jitter trajectories.
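As one illustration of the jump segmentation step, the sketch below groups per-frame obstacle associations into per-obstacle clips. It assumes an upstream association stage has already labeled each frame with an obstacle ID (or None when no obstacle is being jumped). The function name `segment_jumps`, the parameters `max_gap_s` and `pad_s`, and the output keys are hypothetical and only illustrate the idea; they are not an existing CrackCoach interface.

```python
def segment_jumps(frame_obstacle_ids, fps, max_gap_s=0.2, pad_s=1.0):
    """Group per-frame obstacle associations into padded jump segments.

    frame_obstacle_ids: list where entry i is the obstacle ID associated
    with the horse in frame i, or None when no jump is in progress.
    Returns a list of dicts with inclusive start/end frame indices.
    """
    max_gap = int(round(max_gap_s * fps))  # tolerated run of missed frames
    pad = int(round(pad_s * fps))          # context kept before and after
    n = len(frame_obstacle_ids)

    runs = []  # each run is [obstacle_id, first_frame, last_frame]
    for i, obs in enumerate(frame_obstacle_ids):
        if obs is None:
            continue
        same_obstacle = bool(runs) and runs[-1][0] == obs
        if same_obstacle and i - runs[-1][2] - 1 <= max_gap:
            runs[-1][2] = i  # extend the current run across a short gap
        else:
            runs.append([obs, i, i])

    return [
        {
            "obstacle_id": obs,
            "start_frame": max(0, first - pad),
            "end_frame": min(n - 1, last + pad),
        }
        for obs, first, last in runs
    ]
```

Padded segments may overlap when two obstacles are close together (for example in a combination); whether overlapping clips are acceptable is a design decision for the downstream stages.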
2D Metric Calibration (Image → Ground Plane)
In addition to perception, this role includes implementing a robust 2D metric calibration module:
• Estimate a ground-plane homography (image → ground) using stable scene references such as obstacle bases or other ground contact points.
• Compute a pixel-to-meter scale, ideally leveraging known or user-declared obstacle heights (e.g. “course at 1.35 m”) when available.
• Project horse trajectories from image space to ground-plane coordinates in meters.
• Enable metric estimates such as:
   • approach speed (m/s)
   • distances between obstacles (m)
   • take-off and landing distances at ground level (m)
   • approximate stride length at ground level (when combined later with biomechanics)
• Provide a calibration confidence indicator and gracefully fall back to relative (pixel-based) measures when calibration is unreliable.
The calibration module must be robust, non-blocking, and designed for real-world competition footage (single camera, uncontrolled viewpoints).
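To make the projection and fallback behavior concrete, here is a minimal sketch in plain Python. It assumes a 3x3 image-to-ground homography `H` has already been estimated (for example with OpenCV's `cv2.findHomography` using the `cv2.RANSAC` method on obstacle base points) and that a confidence score in [0, 1] is available. The function names and the `min_confidence` threshold of 0.5 are hypothetical choices for illustration, not a prescribed design.

```python
from math import hypot


def project_to_ground(H, point):
    """Map an image point (x, y) to ground-plane coordinates using a
    3x3 homography H given as nested lists (row-major)."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        raise ValueError("point maps to infinity under this homography")
    return (u / w, v / w)


def approach_speed(H, track_px, fps, confidence, min_confidence=0.5):
    """Return (speed, unit) for a track of per-frame image points.

    Uses meters per second when calibration confidence is high enough,
    otherwise falls back to pixels per second instead of failing.
    """
    if len(track_px) < 2 or fps <= 0:
        raise ValueError("need at least two points and a positive fps")
    if confidence >= min_confidence:
        pts = [project_to_ground(H, p) for p in track_px]
        unit = "m/s"
    else:
        pts = list(track_px)  # relative, pixel-based fallback
        unit = "px/s"
    path = sum(hypot(b[0] - a[0], b[1] - a[1]) for a, b in zip(pts, pts[1:]))
    duration = (len(pts) - 1) / fps
    return path / duration, unit
```

Returning the unit alongside the value keeps the module non-blocking: downstream stages always receive a number and can decide how to present a pixel-based estimate.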
⸻
Required Technical Skills
• Strong background in computer vision applied to video (sports footage experience is a strong plus).
• Proven experience with object detection (YOLO family, Detectron2, RT-DETR, etc.).
• Multi-object tracking expertise (ByteTrack / BoT-SORT / DeepSORT; handling occlusions and ID switches).
• Experience with segmentation models (Mask R-CNN, YOLO-Seg, SAM-family) if needed for background removal.
• Solid understanding of image-space geometry and camera perspective limitations.
• Experience implementing 2D metric calibration using planar homography and RANSAC.
• Comfortable working with pixel-to-meter conversions and expressing metric uncertainty.
• Advanced Python and OpenCV; experience with a deep learning framework (PyTorch preferred).
• Experience building modular, maintainable pipelines with clear interfaces and exports.
⸻
Key Technical Challenges
• Highly variable camera angles, zoom levels, and lighting conditions.
• Dynamic occlusions from obstacles, rails, other horses, and spectators.
• Motion blur and compression artifacts in user-generated videos.
• Background clutter and false positives (banners, rails, similar shapes).
• Maintaining stable trajectories despite noisy detections and temporary misses.
• Correct obstacle differentiation and obstacle association in multi-obstacle scenes.
• Metric calibration with a single camera, limited scene control, and partial reference data.
• Performance constraints: processing HD videos in minutes, not hours.
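For the trajectory stability challenge listed above, one simple baseline is to interpolate across short detection gaps and then apply a centered moving average. The sketch below assumes a track is a list of (x, y) image points with None for missed frames; the helper names `fill_gaps` and `smooth` are hypothetical, and a Kalman-based smoother may well be a better fit in practice.

```python
def fill_gaps(track):
    """Linearly interpolate None entries that lie between two known
    points. Leading and trailing None entries are left untouched."""
    filled = list(track)
    known = [i for i, p in enumerate(track) if p is not None]
    for a, b in zip(known, known[1:]):
        (xa, ya), (xb, yb) = track[a], track[b]
        for i in range(a + 1, b):
            t = (i - a) / (b - a)
            filled[i] = (xa + t * (xb - xa), ya + t * (yb - ya))
    return filled


def smooth(track, window=5):
    """Centered moving average over the available (non-None) neighbors.
    The window shrinks at the ends of the track."""
    half = window // 2
    out = []
    for i, p in enumerate(track):
        if p is None:
            out.append(None)
            continue
        nb = [q for q in track[max(0, i - half): i + half + 1] if q is not None]
        out.append((sum(q[0] for q in nb) / len(nb),
                    sum(q[1] for q in nb) / len(nb)))
    return out
```

A larger `window` reduces jitter but also flattens the jump arc, so the value would need tuning against real footage.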
⸻
Expected Deliverables
• A fully modular computer vision pipeline (source code) that ingests raw video and outputs:
   • detections
   • tracks
   • obstacle detections
   • jump segments
   • 2D trajectories
   • ground-plane metric projections (when calibration is reliable)
• A 2D calibration module producing pixel-to-meter scale, ground-plane mapping, and confidence scores.
• Trained detection/segmentation models (weights + training scripts) when custom training is required.
• Clean data exports (JSON / CSV) and stable ROI frame exports for pose estimation and biomechanics.
• Visual validation outputs (overlays showing boxes, tracks, obstacles, jump boundaries, and metric projections).
• Clear technical documentation defining interfaces and data formats for downstream pose estimation, biomechanics, and AI coaching stages.
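As a rough idea of what a documented export format could look like, the sketch below serializes one jump segment to JSON. Every field name, the `schema_version` key, and the `JumpSegment` class itself are hypothetical placeholders; the actual schema is for the engineer to define together with the downstream teams.

```python
import json
from dataclasses import asdict, dataclass
from typing import List, Optional, Tuple


@dataclass
class JumpSegment:
    """One per-obstacle clip with its trajectory (hypothetical schema)."""
    obstacle_id: str
    start_frame: int
    end_frame: int
    trajectory_px: List[Tuple[float, float]]
    # Metric trajectory is None when calibration was judged unreliable.
    trajectory_m: Optional[List[Tuple[float, float]]] = None
    calibration_confidence: float = 0.0


def to_json(segments):
    """Serialize a list of JumpSegment objects to a JSON string."""
    payload = {"schema_version": 1, "jumps": [asdict(s) for s in segments]}
    return json.dumps(payload, indent=2)
```

Keeping the pixel trajectory mandatory and the metric trajectory optional mirrors the graceful-degradation requirement: consumers can always rely on image-space data.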
⸻
Important Notes
• This role does NOT include pose estimation or biomechanics (handled by separate specialists).
• Metric calibration is 2D ground-plane based, not full 3D reconstruction.
• Robustness and graceful degradation are more important than theoretical precision.