Show simple item record

dc.contributor.advisor: Szolovits, Peter
dc.contributor.author: Sergeeva, Elena
dc.date.accessioned: 2025-12-03T16:10:33Z
dc.date.available: 2025-12-03T16:10:33Z
dc.date.issued: 2025-05
dc.date.submitted: 2025-08-14T19:44:01.046Z
dc.identifier.uri: https://hdl.handle.net/1721.1/164141
dc.description.abstract: Knowledge is frequently defined as "justified true belief". As one may notice, this definition presents some issues when applied to AI: it is unclear to what degree it is justified to use "humanizing" vocabulary like "belief" or "justification" when describing the performance of an AI system. Traditional AI based on explicit knowledge representation involves reasoning over symbolic representations of statements standing for such "justified true beliefs" [1]; the modern connectionist methodology, however, replaces explicit reasoning with predictions computed over weighted continuous representations of the inputs. The continuous representations learned by such systems remain "black box-like": the only elements directly understandable by the human user are the model inputs and outputs. In the first part of this thesis, I introduce a set of masked-language-model transformer-based models for a diverse set of medical natural language processing tasks, including Named Entity Recognition, Negation Extraction, and Relation Extraction, that perform as well as or better than larger prompt-and-generate transformer-based causal language models. In the second part of the thesis, I discuss the modern "prompt-and-generate" approach to natural language processing, where both the inputs and the outputs of the model are word-like elements commonly referred to as "tokens". I explore the nature of the token-based representation of the input and look at the way token "meaning" is refined at each layer of the successive transformer computation. With respect to the outputs, I explore how people engage with AI-generated sequences of tokens that they perceive as "explained" predictions.
dc.publisher: Massachusetts Institute of Technology
dc.rights: In Copyright - Educational Use Permitted
dc.rights: Copyright retained by author(s)
dc.rights.uri: https://rightsstatements.org/page/InC-EDU/1.0/
dc.title: Building small domain-specific masked language models vs. large generative models for clinical decision support and their effects on users.
dc.type: Thesis
dc.description.degree: Ph.D.
dc.contributor.department: Massachusetts Institute of Technology. Department of Electrical Engineering and Computer Science
mit.thesis.degree: Doctoral
thesis.degree.name: Doctor of Philosophy

