How to Write a Robotics Data Collection Spec
A vague brief produces unusable data. This template shows how to specify a robotics capture so you get exactly the episodes your policy needs.
TL;DR. A robotics data collection spec should pin down the task, the environments and objects, the camera viewpoint, the signals and annotations, the output format, the success criterion, and the target volume in usable hours. Specifying "what correct looks like" is the single most important line - it determines whether the data is usable.
The spec template
- Task and sub-tasks. Exactly what is performed (e.g. "terminate and dress a 12-way electrical panel").
- Success criterion. What counts as a correct, complete episode. Be unambiguous.
- Environments. How many distinct settings, and how varied (lighting, clutter, layout).
- Objects. The set and variation - generalisation tracks object/environment diversity per imitation-learning scaling laws.
- Viewpoint and sensors. Egocentric? Depth? Hand pose? Reference angle?
- Annotations. Action segmentation, success flags, narration, pose.
- Format. LeRobot, RLDS or HDF5 - see format guide.
- Volume. Target usable hours and episode count, plus splits.
- Compliance. Consent, redaction, DPA, jurisdiction.
Why "what correct means" comes first
Without a crisp success criterion, annotators guess and your success/failure labels are inconsistent - which quietly poisons training. Borrow the rigour of published dataset documentation such as DROID and Open X-Embodiment data cards.
Common mistakes
- Specifying raw hours instead of usable hours.
- Asking for one environment when diversity drives generalisation.
- Forgetting calibration and control-frequency metadata.
- Leaving consent and redaction as an afterthought.
Validate before you scale
Turn the spec into a small paid test kit first, check quality and IAA, then scale - the workflow in how to buy robotics training data.
FAQ
What goes into a robotics data collection spec? Task and sub-tasks, a success criterion, environments and objects, viewpoint and sensors, annotations, output format, target usable hours, and compliance requirements.
What is the most important part of a data spec? The success criterion - "what correct means". It determines label consistency and whether the resulting data is trainable.
Should I specify raw or usable hours? Usable hours. Raw footage includes material removed by redaction and QA; usable hours reflect what reaches your training set.
Send us a spec (or a sketch) and we'll scope it: request a Test Kit or talk to us.
Physical-AI data specialists at OFORO LTD (UK). We write about egocentric data, robotics dataset formats, RLHF and data governance. See what we build.