Human-in-the-Loop Task Directed Exploration and Planning in Unknown Environments

Author(s)

Jois, Aneesh Ramesh

Advisor

Williams, Brian C.

Terms of use

In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/

Abstract

For robots to perform everyday tasks autonomously, as humans do, they should be able to perceive, explore, and act in novel environments while pursuing high-level goals. This capability is known as task-directed exploration, and it is essential in domains ranging from household assistance to disaster response. However, existing approaches each fall short of solving the task-directed exploration problem. Classical symbolic planners require brittle, hand-crafted domain models and assume complete knowledge of the environment. POMDP-based formulations provide a principled approach to planning under uncertainty but are computationally intractable in large, open-world settings. Foundation models such as large language models (LLMs) and vision-language models (VLMs) offer strong commonsense knowledge and pattern-recognition capabilities but lack the structured spatial grounding and adaptivity required for embodied execution. This thesis presents a unified framework that closes this gap by tightly integrating foundation models with a real-time semantic mapping and planning stack. The system consists of four components: (i) a dual-layer perception module that combines a deterministic 3D scene graph with a frontier-based probabilistic belief field, using VLMs for object labeling and LLMs for room classification; (ii) a symbolic task planner that converts natural-language instructions into high-level activity plans; (iii) an exploration executive that selects informative waypoints, monitors task progress, and dynamically triggers replanning and human queries; and (iv) a unified value-of-information (VoI) metric that governs both autonomous exploration and selective human interaction, enabling the robot to reason about uncertainty and task utility in a principled way.
Demonstrated in realistic simulated environments, the proposed framework allows agents to ground natural-language goals in their surroundings, explore efficiently, reason over partial knowledge, and adapt plans as new information is acquired, while involving the user only when doing so meaningfully improves performance.
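
The idea of a single value-of-information score that arbitrates between exploring and asking a person can be illustrated with a small sketch. This is not the metric defined in the thesis; it is a generic textbook-style formulation (expected entropy reduction minus cost) under loud assumptions: the belief is a discrete distribution over which of a few regions holds the target object, observations are noise-free, and every name here (`look_in`, `ask_human`, `choose`, the cost values) is hypothetical.

```python
import math
from dataclasses import dataclass


def entropy(belief):
    """Shannon entropy in bits of a discrete belief."""
    return -sum(p * math.log2(p) for p in belief if p > 0)


def one_hot(n, k):
    return [1.0 if i == k else 0.0 for i in range(n)]


@dataclass
class Action:
    name: str
    cost: float      # assumed to be expressed in the same units as bits
    outcomes: list   # (probability, posterior belief) pairs


def look_in(prior, k, cost):
    """Hypothetical exploration action: inspect region k, noise-free."""
    n, p_found = len(prior), prior[k]
    outcomes = []
    if p_found > 0:
        outcomes.append((p_found, one_hot(n, k)))
    if p_found < 1:
        # Target not seen in region k: renormalise the remaining mass.
        rest = [0.0 if i == k else p / (1 - p_found)
                for i, p in enumerate(prior)]
        outcomes.append((1 - p_found, rest))
    return Action(f"explore_region_{k}", cost, outcomes)


def ask_human(prior, cost):
    """Hypothetical query: the person names the correct region."""
    n = len(prior)
    outcomes = [(p, one_hot(n, k)) for k, p in enumerate(prior) if p > 0]
    return Action("ask_human", cost, outcomes)


def value_of_information(prior, action):
    """Expected entropy reduction minus the cost of acting."""
    expected_posterior = sum(p * entropy(post) for p, post in action.outcomes)
    return (entropy(prior) - expected_posterior) - action.cost


def choose(prior, query_cost=1.0, explore_cost=0.3):
    """Pick the highest-VoI action, or None if nothing is worth its cost."""
    actions = [look_in(prior, k, explore_cost) for k in range(len(prior))]
    actions.append(ask_human(prior, query_cost))
    best = max(actions, key=lambda a: value_of_information(prior, a))
    return best if value_of_information(prior, best) > 0 else None


if __name__ == "__main__":
    for belief in ([0.25, 0.25, 0.25, 0.25], [0.85, 0.05, 0.05, 0.05]):
        action = choose(belief)
        print(belief, "->", action.name if action else "no action")
```

With these assumed costs, a uniform belief makes the human query the best choice, while a belief already concentrated on one region favors simply looking there, which mirrors the abstract's point that the user is involved only when the answer is worth the interruption.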

Date issued

2025-09

URI

https://hdl.handle.net/1721.1/165129

Department

Massachusetts Institute of Technology. Department of Aeronautics and Astronautics

Publisher

Massachusetts Institute of Technology

Collections

Graduate Theses