HIW500: Humanoids In-the-Wild Dataset

An in-the-wild dataset for whole-body humanoid robot learning.

Built for learning from real homes, not lab-only scenes.

HIW500 focuses on task execution in natural environments, where layout, object state, lighting, clutter, and human operating style vary from episode to episode.

The dataset is collected in real homes in Southeast Asia through human whole-body teleoperation of Unitree G1, providing demonstrations for mobile manipulation, bimanual interaction, and long-horizon household skills.

Acquisition setup

The hardware stack uses grippers for household manipulation, with a head camera and wrist-mounted cameras for visual observations.

[Figure: Head-mounted stereo camera setup on the humanoid robot]
[Figure: Hand-mounted IR stereo camera and gripper setup]

01 Gripper manipulation

End-effector grippers are used for household grasping, placing, and object interaction.

02 Head camera

The head-mounted camera provides a wide scene view for task context and navigation.

03 Hand stereo IR cameras

Wrist-mounted stereo IR cameras capture close-range observations around the robot hands.

What data is collected?

Each episode records human whole-body teleoperation of Unitree G1 in real homes, combining camera streams, robot states, action traces, and language annotations.

Head camera

  • RGB
  • Stereo
  • 480P
  • 30 FPS

Wrist camera

  • RGB
  • IR
  • Stereo
  • 480P
  • 30 FPS

State and actions

  • 29-DoF Joints
  • End-Effector
  • IMU
  • Odometry

Metadata

  • Language annotation
  • Episode info
  • Camera intrinsics & extrinsics
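As a rough illustration of how the modalities listed above might sit together in one time step, here is a minimal Python sketch. The class, field names, and stream keys are hypothetical and are not taken from the released files; only the 29 joint degrees of freedom and the 30 FPS camera rate come from the lists above. The dimensions of the end-effector, IMU, and odometry vectors are not stated on this page, so the sketch leaves them unconstrained.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

FPS = 30         # camera frame rate listed above
NUM_JOINTS = 29  # joint degrees of freedom listed above


@dataclass
class FrameRecord:
    """One time step of an episode. All field names are hypothetical."""

    timestamp_s: float
    # Maps a stream name (hypothetical keys such as "head_rgb") to a frame reference.
    images: Dict[str, str]
    joint_positions: List[float]  # expected length: NUM_JOINTS
    end_effector: List[float]     # size not specified on this page
    imu: List[float]              # size not specified on this page
    odometry: List[float]         # size not specified on this page
    language_annotation: Optional[str] = None

    def validate(self) -> None:
        """Raise ValueError if the joint vector does not have NUM_JOINTS entries."""
        if len(self.joint_positions) != NUM_JOINTS:
            raise ValueError(
                f"expected {NUM_JOINTS} joint values, got {len(self.joint_positions)}"
            )


def expected_frames(duration_s: float, fps: int = FPS) -> int:
    """Number of camera frames for a recording of the given length, assuming no drops."""
    return round(duration_s * fps)
```

Under the same no-dropped-frames assumption, `expected_frames(500 * 3600)` gives 54,000,000 frames per camera stream for 500 hours of recording.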

Dataset Statistics

Task Coverage

  • Building a children's table
  • Hang hanger
  • Clean up the room
  • Setting the table
  • Restocking fridge
  • Kitchen organization
  • Hang keys on a hook
  • Move pillow to sofa
  • Sweep floor
  • Picking trash
  • Clothes washing

[Charts: Task Episodes; Average Duration]

Subtask Labels

  • 161 total subtask labels
  • 148k+ subtask annotations
  • Top 40 labels shown in the treemap

Every robot subtask is annotated with a fine-grained action label.
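A label ranking like the one behind the treemap can be computed by counting annotations per label. The sketch below assumes the subtask labels have already been read into a plain sequence of strings; the function name and the example labels are hypothetical, and the actual annotation format is not described here.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_labels(labels: Iterable[str], k: int = 40) -> List[Tuple[str, int]]:
    """Return the k most frequent labels with their annotation counts."""
    return Counter(labels).most_common(k)


# Toy input with made-up label strings, not real dataset labels.
example = ["pick", "place", "pick", "open", "pick", "place"]
ranking = top_labels(example, k=2)
```

With the toy input, `ranking` is `[("pick", 3), ("place", 2)]`.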

Open Source timeline

Release V1: 500+ hours of data

Release more tasks and environments

We host the dataset on Hugging Face in two formats: raw ROS bag / MCAP recordings and a version in LeRobot format for robot learning workflows.

@misc{hiw500_2026,
  title={HIW500: Humanoids In-the-Wild Dataset for Robot Learning},
  author={BitRobot and Unitree and Hugging Face},
  year={2026},
  howpublished={\url{https://bitrobot-foundation.github.io/humanoids-in-the-wild-500-hours/}}
}
