Building small domain-specific masked language models vs. large generative models for clinical decision support and their effects on users.
Author(s)
Sergeeva, Elena
Advisor
Szolovits, Peter
Abstract
A frequently adopted definition of knowledge is “justified true belief”. As one may notice, this definition presents some issues when applied to AI: it is unclear to what degree it is justified to use “humanizing” vocabulary like “belief” or “justification” when describing the performance of an AI system. Traditional AI based on explicit knowledge representation involves reasoning over symbolic representations of statements standing for such “justified true beliefs” [1]; the modern connectionist methodology, however, replaces explicit reasoning with predictions based on a set of computations over weighted continuous representations of the inputs. The continuous representations learned by such systems remain “black box-like”: the only elements directly understandable by the human user are the model inputs and outputs. In the first part of this thesis, I introduce a set of transformer-based masked language models for a diverse set of medical natural language processing tasks, including named entity recognition, negation extraction, and relation extraction, that perform as well as or better than bigger prompt-and-generate transformer-based causal language models. In the second part of the thesis, I discuss the modern “prompt-and-generate” approach to natural language processing, where both the inputs and the outputs of the model are word-like elements commonly referred to as “tokens”. I explore the nature of the token-based representation of the input and look at the way token “meaning” is refined at each successive layer of the transformer computation. With respect to the outputs, I explore how people engage with AI-generated sequences of tokens that they perceive as “explained” predictions.
Date issued
2025-05
Department
Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
Publisher
Massachusetts Institute of Technology