Show simple item record

dc.contributor.advisorShun, Julian
dc.contributor.advisorCafarella, Michael J.
dc.contributor.authorZhang, Ziyu
dc.date.accessioned2025-11-25T19:37:42Z
dc.date.available2025-11-25T19:37:42Z
dc.date.issued2025-05
dc.date.submitted2025-08-14T19:34:52.541Z
dc.identifier.urihttps://hdl.handle.net/1721.1/164033
dc.description.abstractThe recent advancement of large language models (LLMs) and large multimodal models (LMMs) greatly enhances the capabilities of AI systems such as recommendation systems and coding assistants, making them more practical for real-world deployment. However, these models cannot directly interact with large volumes of data in a knowledge corpus during inference/task time due to inherent architectural limits and cost concerns. Encoding data into vector embeddings and leveraging approximate nearest neighbor search (ANNS) have thus become an important data processing primitive in AI systems following the introduction of retrievel-augmented generation (RAG). However, the complexity of tasks these AI systems aim to solve introduces challenges for existing ANNS algorithms. I developed methods to expand existing ANNS algorithms to address two such challenges: freshness and heterogeneity in the data. Graph-based ANNS algorithms have been proven to have superb cost versus approximation quality trade-off yet follow a simple intuition of best-first search. I focus on adapting graph-based ANNS algorithms to two settings featuring emerging challenges. (1) Data is updated constantly. Existing algorithms are inefficient under deletions and not robust against different orderings in the workload. I propose methods addressing these problems and developed an algorithm supporting updates effectively and efficiently based on Vamana, a state-of-the-art graph-based ANNS algorithm. (2) Data is heterogeneous in format, modality, and how they relate to a query, making the similarity difficult to capture by the canonical ANNS definition. I explore ways to model the similarity between heterogeneous sources and using graph-based ANNS approaches to perform semantic search in this setting. I test this approach under an end-to-end multimodal question-answering system developed in-house.
dc.publisherMassachusetts Institute of Technology
dc.rightsIn Copyright - Educational Use Permitted
dc.rightsCopyright retained by author(s)
dc.rights.urihttps://rightsstatements.org/page/InC-EDU/1.0/
dc.titleGraph-based Vector Search Algorithms for Retrieval-Augmented AI Systems
dc.typeThesis
dc.description.degreeS.M.
dc.contributor.departmentMassachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
dc.identifier.orcid0000-0002-8605-0169
mit.thesis.degreeMaster
thesis.degree.nameMaster of Science in Electrical Engineering and Computer Science


Files in this item

Thumbnail

This item appears in the following Collection(s)

Show simple item record