Towards Scalable Robot Learning without Physical Robots

Author(s)

Park, Younghyo

Advisor

Agrawal, Pulkit

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

The development of generalist robots—capable of performing a wide range of tasks in diverse environments—requires large-scale datasets of robot interactions. Unlike language or vision domains, where data can be passively collected at scale, robotic data collection remains costly, labor-intensive, and constrained by physical hardware. This thesis explores two complementary directions to overcome this challenge. First, we examine the limitations of training robots from scratch using reinforcement learning (RL). While RL has achieved promising results in simulation, its scalability is hindered by a largely overlooked bottleneck: environment shaping. Designing suitable rewards, action and observation spaces, and task dynamics typically requires extensive human intervention. We formalize environment shaping as a critical optimization problem and introduce tools and benchmarks to study and eventually automate this process, a necessary step toward general-purpose RL. Second, we introduce an alternative paradigm for robot data collection that does not rely on real-world robots. Using the Apple Vision Pro, we develop DART, an augmented reality (AR) teleoperation platform that streams human hand motions to cloud-hosted robot simulations. This setup enables scalable, low-latency collection of high-quality robot demonstrations without the overhead of physical setup or maintenance. Our user studies show that DART more than doubles data collection throughput while reducing operator fatigue, and policies trained in simulation using this data successfully transfer to the real world. Together, these contributions address two key bottlenecks in robot learning: the human effort required for RL environment design, and the dependence on physical robots for data. They lay the groundwork for scalable, accessible approaches to training generalist robot models.

Date issued

2025-05

URI

https://hdl.handle.net/1721.1/163708

Department

Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses