DSpace@MIT

Human-in-the-Loop Task Directed Exploration and Planning in Unknown Environments

Author(s)
Jois, Aneesh Ramesh
Download: Thesis PDF (10.64Mb)
Advisor
Williams, Brian C.
Terms of use
In Copyright - Educational Use Permitted. Copyright retained by author(s). https://rightsstatements.org/page/InC-EDU/1.0/
Abstract
For robots to perform everyday tasks autonomously, as humans do, they should be able to perceive, explore, and act in novel environments while pursuing high-level goals. This capability, known as task-directed exploration, is essential in domains ranging from household assistance to disaster response. However, existing approaches each fall short of solving the task-directed exploration problem. Classical symbolic planners require brittle, hand-crafted domain models and assume complete knowledge of the environment. POMDP-based formulations provide a principled approach to planning under uncertainty but are computationally intractable in large, open-world settings. Foundation models such as large language models (LLMs) and vision-language models (VLMs) offer strong commonsense knowledge and pattern recognition capabilities but lack the structured spatial grounding and adaptivity required for embodied execution. This thesis presents a unified framework that closes this gap by tightly integrating foundation models with a real-time semantic mapping and planning stack. The system consists of four components: (i) a dual-layer perception module that combines a deterministic 3D scene graph with a frontier-based probabilistic belief field, using VLMs for object labeling and LLMs for room classification; (ii) a symbolic task planner that converts natural language instructions into high-level activity plans; (iii) an exploration executive that selects informative waypoints, monitors task progress, and dynamically triggers replanning and human queries; and (iv) a unified value-of-information (VoI) metric that governs both autonomous exploration and selective human interaction, enabling the robot to reason about uncertainty and task utility in a principled way.
Demonstrated in realistic simulated environments, the proposed framework allows agents to ground natural language goals in their surroundings, explore efficiently, reason over partial knowledge, and adapt plans as new information is acquired, while involving the user only when doing so meaningfully improves performance.
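As a rough illustration of how a value-of-information test can gate human queries, the sketch below computes the textbook expected value of perfect information for a one-shot object search. It is not the thesis's VoI metric, which the abstract does not specify. The function names, room labels, reward, travel costs, and query cost are all hypothetical, and the robot is assumed to make a single trip (or stay put for zero utility).

```python
from typing import Dict


def best_prior_utility(belief: Dict[str, float], reward: float,
                       travel_cost: Dict[str, float]) -> float:
    """Expected utility of the best single trip under the current belief.

    Staying put (utility 0) is always allowed, so the result is never negative.
    """
    best_trip = max(belief[room] * reward - travel_cost[room] for room in belief)
    return max(best_trip, 0.0)


def utility_with_perfect_info(belief: Dict[str, float], reward: float,
                              travel_cost: Dict[str, float]) -> float:
    """Expected utility if the true room were revealed before acting."""
    return sum(p * max(reward - travel_cost[room], 0.0)
               for room, p in belief.items())


def value_of_asking(belief: Dict[str, float], reward: float,
                    travel_cost: Dict[str, float]) -> float:
    """Expected value of perfect information (assumes the human knows the answer)."""
    return (utility_with_perfect_info(belief, reward, travel_cost)
            - best_prior_utility(belief, reward, travel_cost))


def should_ask_human(belief: Dict[str, float], reward: float,
                     travel_cost: Dict[str, float], query_cost: float) -> bool:
    """Query the human only when the information is worth more than the interruption."""
    return value_of_asking(belief, reward, travel_cost) > query_cost


if __name__ == "__main__":
    # Hypothetical belief over where a target object is located.
    belief = {"kitchen": 0.5, "office": 0.3, "garage": 0.2}
    travel_cost = {"kitchen": 2.0, "office": 1.0, "garage": 4.0}
    print(should_ask_human(belief, reward=10.0, travel_cost=travel_cost,
                           query_cost=1.0))
```

With an uncertain belief the information is worth more than a cheap query, so the robot asks; once the belief collapses onto one room the value of asking drops to zero and the robot acts on its own. A fuller treatment would replace the one-shot assumption with sequential search and an imperfect human answer.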
Date issued
2025-09
URI
https://hdl.handle.net/1721.1/165129
Department
Massachusetts Institute of Technology. Department of Aeronautics and Astronautics
Publisher
Massachusetts Institute of Technology

Collections
  • Graduate Theses
