MimicGen Data Augmentation Pipeline for Robotic Manipulation

A project log for ChefMate - AI LeRobot Arm

Robotic arm workflow with the NVIDIA GR00T N1.5 model. Dataset recording, fine-tuning, debugging, and deployment for pick-and-place tasks.

Vipin M, 10/08/2025 at 22:35

Project Overview

I implemented a complete MimicGen data augmentation pipeline to generate multiple training demonstrations from a single recorded episode. The goal was to overcome the data scarcity problem in robotic manipulation by automatically creating diverse variations of expert demonstrations.

This project log documents the systematic implementation of the 4-step MimicGen workflow, from converting demonstrations to IK actions through generating 10x augmented data, and the debugging challenges encountered along the way.

The pipeline successfully transformed 1 original demonstration into 10 augmented demonstrations with a 71.4% generation success rate, providing rich training data for imitation learning policies.

Hardware Setup

The Problem: Data Scarcity in Robotic Learning

Initial Challenge

Robotic manipulation policies require large amounts of diverse training data, but collecting demonstrations is:

Traditional approach: Record 50-100 demonstrations manually.
MimicGen approach: Record 1 demonstration → Generate 10+ variations automatically.

MimicGen Pipeline Overview

The 4-Step Workflow

  1. Convert to IK Actions: Transform joint-space actions (6D) to end-effector actions (8D)
  2. Annotate Subtasks: Automatically detect subtask boundaries using termination signals
  3. Generate Augmented Data: Create variations by recombining subtask segments
  4. Convert to Joint Actions: Transform back to joint-space for training

Task Structure: Lift Cube

Subtask 1: pick_cube - Approach and grasp the cube
Subtask 2: lift_cube - Lift cube above threshold height

Key Requirements:

Debugging Approach

Step 1: Environment Configuration Issues

Problem: MimicGen annotation failed with “The final task was not completed” error.

Root Cause Analysis:

Solution: Added lift_cube observation function:

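# Imports assume the Isaac Lab 2.x package layout (isaaclab.*).
import torch
from isaaclab.assets import Articulation, RigidObject
from isaaclab.envs import ManagerBasedRLEnv
from isaaclab.managers import SceneEntityCfg


def lift_cube(
        env: ManagerBasedRLEnv,
        cube_cfg: SceneEntityCfg = SceneEntityCfg("cube"),
        robot_cfg: SceneEntityCfg = SceneEntityCfg("robot"),
        robot_base_name: str = "base",
        height_threshold: float = 0.05) -> torch.Tensor:
    """Check if the cube is lifted above the robot base."""
    cube: RigidObject = env.scene[cube_cfg.name]
    robot: Articulation = env.scene[robot_cfg.name]
    # World-frame z coordinate of the cube, one value per environment.
    cube_height = cube.data.root_pos_w[:, 2]
    # World-frame z coordinate of the robot base link.
    base_index = robot.data.body_names.index(robot_base_name)
    robot_base_height = robot.data.body_pos_w[:, base_index, 2]
    # True where the cube is more than height_threshold above the base.
    above_base = cube_height - robot_base_height > height_threshold
    return above_base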
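To make the comparison concrete without an Isaac Lab install, here is a minimal pure-Python sketch of the same check. The function name `is_lifted` and the sample heights are hypothetical; the real function above works on batched torch tensors read from the simulator.

```python
def is_lifted(cube_heights, base_heights, height_threshold=0.05):
    """Return one boolean per environment.

    A value is True when the cube sits more than `height_threshold`
    metres above the robot base in that environment.
    """
    return [
        (cube_z - base_z) > height_threshold
        for cube_z, base_z in zip(cube_heights, base_heights)
    ]


# Hypothetical world-frame z values (metres) for three parallel environments.
cube_z = [0.02, 0.08, 0.30]
base_z = [0.00, 0.00, 0.20]

flags = is_lifted(cube_z, base_z)
print(flags)
```

Because the check is relative to the base height, the third environment counts as lifted even though its base sits at 0.20 m.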

Step 2: Height Threshold Calibration

Critical Discovery: The default height threshold (0.20m) was too strict for the actual cube size.

Investigation Process:

  1. Examined cube model file: /assets/scenes/table_with_cube/cube/model.xml
  2. Found actual dimensions: 0.015077m × 0.015077m × 0.015077m (1.5cm cube)
  3. Calculated appropriate threshold: 0.05m (3.3× cube height)

Configuration Update:

# Updated threshold in both environments
height_threshold: float = 0.05  # Changed from 0.20m
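The threshold arithmetic can be checked in a few lines. This sketch only restates the numbers from the investigation above; the helper name `threshold_in_cube_heights` is made up for illustration.

```python
CUBE_SIDE_M = 0.015077    # cube edge length found in model.xml
OLD_THRESHOLD_M = 0.20    # original default threshold
NEW_THRESHOLD_M = 0.05    # calibrated threshold


def threshold_in_cube_heights(threshold_m, cube_side_m=CUBE_SIDE_M):
    """Express a lift threshold as a multiple of the cube's edge length."""
    return threshold_m / cube_side_m


old_ratio = threshold_in_cube_heights(OLD_THRESHOLD_M)
new_ratio = threshold_in_cube_heights(NEW_THRESHOLD_M)
print(f"old threshold: {old_ratio:.1f}x cube height")
print(f"new threshold: {new_ratio:.1f}x cube height")
```

Expressed this way, the old default asked for a lift of roughly 13 cube heights, while the calibrated value asks for about 3.3.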

Step 3: MimicGen Configuration Requirements

Problem: Assertion error during generation: “assert subtask_configs[-1].subtask_term_offset_range[0] == 0”

Root Cause: Final subtask had incorrect offset range configuration.

MimicGen Requirements:

Solution:

# Final subtask configuration
subtask_configs.append(
    SubTaskConfig(
        object_ref="cube",
        subtask_term_signal=None,  # No termination signal for final subtask
        subtask_term_offset_range=(0, 0),  # Required by MimicGen
        selecti,
        description="Lift cube",
        next_subtask_description=None,
    )
)
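A pre-flight check can catch this misconfiguration before a generation run starts. The sketch below is not MimicGen code: `SubtaskSpec` is a hypothetical stand-in that carries only the fields discussed here, and the first subtask's signal name and offset range are invented sample values. The first check mirrors the assertion quoted above; the second mirrors the fix in this log and may be stricter than the library requires.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SubtaskSpec:
    """Hypothetical stand-in for a subtask config (illustration only)."""
    description: str
    subtask_term_signal: Optional[str]
    subtask_term_offset_range: Tuple[int, int]


def check_final_subtask(specs: List[SubtaskSpec]) -> List[str]:
    """Return a list of problems; an empty list means the check passed."""
    if not specs:
        return ["no subtasks configured"]
    problems = []
    final = specs[-1]
    # Mirrors: assert subtask_configs[-1].subtask_term_offset_range[0] == 0
    if final.subtask_term_offset_range[0] != 0:
        problems.append("final subtask offset range must start at 0")
    # Mirrors the fix in this log: the final subtask has no termination signal.
    if final.subtask_term_signal is not None:
        problems.append("final subtask should not define a termination signal")
    return problems


good = [
    SubtaskSpec("Pick cube", "pick_cube", (10, 20)),  # sample values
    SubtaskSpec("Lift cube", None, (0, 0)),
]
bad = [
    SubtaskSpec("Pick cube", "pick_cube", (10, 20)),
    SubtaskSpec("Lift cube", "lift_cube", (10, 20)),
]

print(check_final_subtask(good))
print(check_final_subtask(bad))
```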

Critical Discovery: Environment Compatibility

The AttributeError Challenge

Problem: AttributeError: 'ManagerBasedRLLeIsaacMimicEnv' object has no attribute 'scene'

Root Cause: The MimicGen environment has a different internal structure than the regular environment and does not expose a scene attribute directly.

Solution: Added compatibility handling in termination functions:

# Handle both regular env and mimic env
if hasattr(env, 'scene'):
    num_envs = env.num_envs
    device = env.device
    cube: RigidObject = env.scene[cube_cfg.name]
    robot: Articulation = env.scene[robot_cfg.name]
else:
    # For mimic environments, try alternative access patterns
    num_envs = getattr(env, '_num_envs', 1)
    device = getattr(env, '_device', torch.device('cuda'))
    scene = getattr(env, '_scene', None) or getattr(env, 'env', None)
    if scene is None:
        return torch.tensor([True], dtype=torch.bool, device=device)
    cube: RigidObject = scene[cube_cfg.name]
    robot: Articulation = scene[robot_cfg.name]
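The fallback logic can also be pulled into a small helper so each termination function does not repeat it. This is a sketch under assumptions: `resolve_scene`, `PlainEnv`, and `WrappedEnv` are hypothetical names, and whether the mimic environment really exposes `_scene` or an inner `env` is not confirmed here. Unlike the snippet above, which uses the `env` attribute itself as the scene, this variant looks for a scene on the inner object.

```python
def resolve_scene(env):
    """Return a scene-like object for `env`, or None if none is found."""
    # 1. Regular environments expose the scene directly.
    scene = getattr(env, "scene", None)
    if scene is not None:
        return scene
    # 2. Some wrappers may keep it under a private attribute (assumption).
    scene = getattr(env, "_scene", None)
    if scene is not None:
        return scene
    # 3. Otherwise look one level down, on a wrapped inner environment.
    inner = getattr(env, "env", None)
    if inner is not None and inner is not env:
        return resolve_scene(inner)
    return None


class PlainEnv:
    """Hypothetical stand-in for a regular environment."""
    def __init__(self):
        self.scene = {"cube": "cube-object", "robot": "robot-object"}


class WrappedEnv:
    """Hypothetical stand-in for an environment that hides the scene."""
    def __init__(self, inner):
        self.env = inner


class EmptyEnv:
    """Hypothetical stand-in with no scene at all."""


print(resolve_scene(PlainEnv())["cube"])
print(resolve_scene(WrappedEnv(PlainEnv()))["robot"])
print(resolve_scene(EmptyEnv()))
```

Returning None when nothing is found leaves the caller to decide what to do, rather than silently reporting the task as complete.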

The Solution: Complete Pipeline Implementation

Step 1: Convert to IK Actions

/home/vipin/IsaacSim/_build/linux-x86_64/release/python.sh scripts/mimic/eef_action_process.py \
    --input_file=./datasets/so101_lift_cube.hdf5 \
    --output_file=./datasets/processed_so101_lift_cube.hdf5 \
    --to_ik \
    --device=cuda \
    --headless

Result: 6D joint actions → 8D end-effector actions (7 pose + 1 gripper)
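For readers unfamiliar with the 8D layout, here is one way to unpack such an action vector. The log only states "7 pose + 1 gripper"; the split into 3 position values followed by 4 quaternion values, and their ordering, is an assumption, as are the names `EefAction` and `split_eef_action`.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class EefAction:
    """Hypothetical view of an 8D end-effector action."""
    position: Tuple[float, ...]      # assumed: x, y, z
    orientation: Tuple[float, ...]   # assumed: 4 quaternion components
    gripper: float                   # single gripper command


def split_eef_action(action: Sequence[float]) -> EefAction:
    """Split an 8-element action into pose (7 values) and gripper (1 value)."""
    if len(action) != 8:
        raise ValueError(f"expected 8 values, got {len(action)}")
    return EefAction(
        position=tuple(action[0:3]),
        orientation=tuple(action[3:7]),
        gripper=float(action[7]),
    )


# Sample values only; not taken from the recorded dataset.
sample = [0.10, 0.00, 0.05, 1.0, 0.0, 0.0, 0.0, 0.5]
parsed = split_eef_action(sample)
print(parsed)
```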

Step 2: Annotate Subtasks

/home/vipin/IsaacSim/_build/linux-x86_64/release/python.sh scripts/mimic/annotate_dataset.py \
    --input_file=./datasets/processed_so101_lift_cube.hdf5 \
    --output_file=./datasets/annotated_so101_lift_cube.hdf5 \
    --task=LeIsaac-SO101-LiftCube-Mimic-v0 \
    --device=cuda \
    --headless

Result: Added subtask termination signals for automatic segmentation.
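The idea of segmenting by a termination signal can be illustrated with a per-step boolean list. This is a simplified sketch for a two-subtask episode, not how annotate_dataset.py is implemented; the function names and the sample signal are hypothetical.

```python
def first_true_index(signal):
    """Return the first step at which the signal is True, or None."""
    for step, value in enumerate(signal):
        if value:
            return step
    return None


def segment_by_signal(signal):
    """Split an episode into (start, end) step ranges, end exclusive.

    The first subtask ends at the step where its termination signal
    first fires; the remaining steps belong to the final subtask.
    """
    boundary = first_true_index(signal)
    if boundary is None:
        # The signal never fired: the whole episode is one segment.
        return [(0, len(signal))]
    return [(0, boundary), (boundary, len(signal))]


# Sample per-step signal for the first subtask: fires at step 4 of 7.
pick_done = [False, False, False, False, True, True, True]
segments = segment_by_signal(pick_done)
print(segments)
```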

Step 3: Generate Augmented Data

/home/vipin/IsaacSim/_build/linux-x86_64/release/python.sh scripts/mimic/generate_dataset.py \
    --input_file=./datasets/annotated_so101_lift_cube.hdf5 \
    --output_file=./datasets/generated_so101_lift_cube.hdf5 \
    --task=LeIsaac-SO101-LiftCube-Mimic-v0 \
    --num_demos=10 \
    --device=cuda \
    --headless

Result: 1 demonstration → 10 augmented demonstrations (71.4% success rate)

Step 4: Convert Back to Joint Actions

/home/vipin/IsaacSim/_build/linux-x86_64/release/python.sh scripts/mimic/eef_action_process.py \
    --input_file=./datasets/generated_so101_lift_cube.hdf5 \
    --output_file=./datasets/final_so101_lift_cube.hdf5 \
    --to_joint \
    --device=cuda \
    --headless

Result: 8D end-effector actions → 6D joint actions ready for training.
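Since each step consumes the previous step's output file, the four commands can be chained from one script. The sketch below only assembles and prints the argument lists, reusing the scripts, flags, and file names shown above; `build_pipeline` and `run_pipeline` are hypothetical helpers, and the interpreter path is specific to this machine.

```python
import shlex
import subprocess

PYTHON_SH = "/home/vipin/IsaacSim/_build/linux-x86_64/release/python.sh"
TASK = "LeIsaac-SO101-LiftCube-Mimic-v0"
COMMON = ["--device=cuda", "--headless"]


def build_pipeline(dataset_dir="./datasets", name="so101_lift_cube", num_demos=10):
    """Return the four pipeline commands as argument lists, in order."""
    def path(prefix):
        return f"{dataset_dir}/{prefix}{name}.hdf5"

    eef = "scripts/mimic/eef_action_process.py"
    return [
        # Step 1: joint-space actions -> end-effector (IK) actions
        [PYTHON_SH, eef, f"--input_file={path('')}",
         f"--output_file={path('processed_')}", "--to_ik", *COMMON],
        # Step 2: annotate subtask boundaries
        [PYTHON_SH, "scripts/mimic/annotate_dataset.py",
         f"--input_file={path('processed_')}",
         f"--output_file={path('annotated_')}", f"--task={TASK}", *COMMON],
        # Step 3: generate augmented demonstrations
        [PYTHON_SH, "scripts/mimic/generate_dataset.py",
         f"--input_file={path('annotated_')}",
         f"--output_file={path('generated_')}", f"--task={TASK}",
         f"--num_demos={num_demos}", *COMMON],
        # Step 4: end-effector actions -> joint-space actions
        [PYTHON_SH, eef, f"--input_file={path('generated_')}",
         f"--output_file={path('final_')}", "--to_joint", *COMMON],
    ]


def run_pipeline(commands):
    """Run each step in order, stopping at the first failure."""
    for argv in commands:
        subprocess.run(argv, check=True)


commands = build_pipeline()
for argv in commands:
    print(shlex.join(argv))
```

Calling `run_pipeline(commands)` would execute the steps; with `check=True`, a failed annotation step stops the run before generation starts on a stale file.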

Technical Implementation Details

Dataset Pipeline Summary

| Stage | File | Size | Episodes | Action Dim | Description |
|---|---|---|---|---|---|
| Original | so101_lift_cube.hdf5 | 239.5 MB | 1 | 6D | Recorded demonstration |
| IK Converted | processed_so101_lift_cube.hdf5 | 74.2 MB | 1 | 8D | End-effector actions |
| Annotated | annotated_so101_lift_cube.hdf5 | 74.4 MB | 1 | 8D | With subtask signals |
| Generated | generated_so101_lift_cube.hdf5 | 732.5 MB | 10 | 8D | Augmented data |
| Final | final_so101_lift_cube.hdf5 | 732.5 MB | 10 | 6D | Ready for training |

LeRobot Conversion

conda activate lerobot
python scripts/convert/isaaclab2lerobot.py

Configuration:

repo_id = 'sparkmt/so101_lift_cube_mimicgen'
robot_type = 'so101_follower'
fps = 30
task = 'Lift cube from table using MimicGen augmented data'

Result: 21 MB LeRobot v3.0 dataset with AV1-compressed videos.

Results and Validation

Data Augmentation Success

Generation Statistics

MimicGen Generation Results:
- Total attempts: 14
- Successful generations: 10
- Failed generations: 4
- Success rate: 71.4%
- Average generation time: ~2 minutes per demo
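The success rate follows directly from the attempt counts, and the same ratio gives a rough budget for larger runs. In this sketch the helper names are hypothetical, and the projection assumes the observed 10-in-14 ratio would hold at larger scale, which this single run does not establish.

```python
def success_rate(successes, attempts):
    """Return the generation success rate as a percentage."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return 100.0 * successes / attempts


def attempts_needed(target_demos, successes=10, attempts=14):
    """Estimate attempts required for `target_demos` successful demos.

    Uses integer ceiling division to avoid floating-point rounding.
    """
    return -(-target_demos * attempts // successes)


rate = success_rate(10, 14)
print(f"success rate: {rate:.1f}%")
print(f"attempts for 50 demos: {attempts_needed(50)}")
```

At roughly 2 minutes per demo, such an estimate also gives a first guess at wall-clock time for a larger generation run.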

Dataset Quality Metrics

LeRobot Dataset Structure

sparkmt/so101_lift_cube_mimicgen/
├── data/           (372K) - Parquet files with actions/states
├── videos/         (20M)  - AV1-compressed dual camera videos
├── meta/           (84K)  - Dataset metadata and statistics
└── images/         (12K)  - Sample images

Compression Efficiency: 732.5 MB HDF5 → 21 MB LeRobot (about 97% size reduction)

Technical Insights

1. Subtask Configuration is Critical

2. Environment Compatibility Matters

3. Action Space Consistency

4. Success Rate Expectations

Current Status

Summary

This project successfully implemented a complete MimicGen data augmentation pipeline, transforming a single demonstration into 10 diverse training examples. The systematic debugging approach revealed critical requirements for MimicGen configuration, including proper subtask termination signals, accurate height thresholds, and environment compatibility handling.

The pipeline achieved a 71.4% generation success rate and produced a high-quality dataset with 10× data augmentation. The final LeRobot dataset provides rich training data with dual camera observations and diverse manipulation trajectories, all compressed efficiently using modern video codecs.

Key technical contributions include environment compatibility fixes, proper subtask configuration, and automated conversion between action spaces. The debugging infrastructure and systematic approach provide a foundation for scaling to more complex manipulation tasks.

The project demonstrates the power of automated data augmentation for robotic learning, reducing the manual data collection burden while increasing training data diversity and quality.

Next: Training imitation learning policies on the augmented dataset and comparing performance against single-demonstration baselines.

Framework: Isaac Sim 5.0 + Isaac Lab + MimicGen
Data Pipeline: HDF5 → LeRobot v3.0 (21 MB, AV1-compressed)
Hardware: SO-101 Robot Arm, RTX 4080 Super
Success Metrics: 10× data augmentation, 71.4% generation success rate
